Questions tagged [kylo]

Kylo is an open-source data lake management platform. Ask programming-related questions here. For other topics, refer to the Kylo Community on Google Groups.

Kylo offers a turn-key, enterprise-ready data lake solution that integrates best practices around metadata management, governance, and security gleaned from Think Big's 150+ big data implementation projects.

For non-programming topics, refer to the Kylo Community on Google Groups.

68 questions
2 votes, 1 answer

Liquibase exception on Kylo start

I created an RPM from the master branch and installed it on my HDP 2.4 sandbox with MySQL as the default metastore for Kylo. I am running into the issue below after starting the Kylo service. Has anyone encountered this before? Caused by:…
Shashi
2 votes, 3 answers

Kylo | High Water Mark functionality

I have a feed that runs every five minutes and uses the load/release Hive water mark feature. Consider a scenario where a job execution takes more than 5 minutes and the water mark commit has not happened. In this scenario, will Kylo launch another feed instance…
Shashi
2 votes, 2 answers

Kylo | Visual Query Spark Job - Cluster Vs Client mode

By default, the Visual Query Spark job runs in local mode. What is the suggested setting for Visual Query when running Kylo in production with larger volumes of data? Thanks, Shashi
Shashi
2 votes, 1 answer

Kylo | Access Template Exception

I created a role in Kylo and assigned it only the create/update feed permission. I was able to create a feed, but when I tried to access it I got an access template exception. Any pointers here?
Shashi
2 votes, 1 answer

Kylo | Read Only Permission for Running Feeds

How do I give a logged-in user read-only permission for the Operations Manager in Kylo? Permission Setting. Ops Manager View. The logged-in user is able to Fail/Abandon a running job. Thanks, Shashi
Shashi
1 vote, 0 answers

Apache NiFi 1.12.0 shows invalid component while adding Kylo processor

When I try to add the Kylo processor .nar files to NiFi, the processor shows up as in the image, with the comment: Missing processor validated against any property is invalid and processor is of type com.thinkbiganalytics.v2.spark.ExecuteSparkJob but this is not a valid…
subinksoman
1 vote, 0 answers

Kylo security implementation, OpenLDAP implementation in Kylo with Kerberos

We are trying to integrate Kylo with OpenLDAP and Kerberos, but there seem to be no configuration changes suggested in the Kylo docs. In https://kylo.readthedocs.io/en/latest/security/KyloKerberosSPNEGO.html, only auth-ad changes are suggested in the…
1 vote, 0 answers

I want to set up Kylo as a web application, not just an ingestion tool, and I want to know the risks and the best way to do this

Is it perfectly OK to use Kylo not just as an ingestion tool, but to extend it to serve queries that read data from Hive, or to use (custom) APIs to run queries in Hive? Will exposing Kylo with Hive to run queries pose any security risk?
1 vote, 0 answers

When I create a new feed, I get this error: Error saving Feed Duplicate key ProcessGroupDTO:b20b995a-0165-1000-0479-b059731bba5b

When I create a new feed, I get this error: Error saving Feed Duplicate key ProcessGroupDTO:b20b995a-0165-1000-0479-b059731bba5b kylo-servers-log:
bule
1 vote, 1 answer

NiFi UpdateAttribute not working for dynamic variable

I am trying to get the count of files processed by ListHDFS, so the flow looks like this: ListHDFS -> UpdateAttribute -> LogAttribute. I configured UpdateAttribute as per the documentation (see attachment). Strangely, I am not even seeing "fileCount" in…
Rakesh Prasad
1 vote, 1 answer

Running record count from the SplitRecord processor in NiFi

Is there a way to get the fragment index from the SplitRecord processor in NiFi? I am splitting a very big XLS (4 million records) with "Records Per Split" = 100000. Now I want to process just the first 2 splits, to check the quality of the file, and reject the rest of the…
Rakesh Prasad
1 vote, 1 answer

CSV to JSON with dynamic schema using NiFi

I am getting a CSV file from a 3rd party. The schema for this file is dynamic; the only things I can be certain of are: each column with data will also have a header name. The file will always have a header. The header name will always be a string of alphabets…
Rakesh Prasad
1 vote, 2 answers

How to send multipart/form-data via InvokeHTTP in NiFi

I have a 3rd-party REST API, which I am able to call successfully like this using curl (shell). This API returns JSON. I tried calling the same API after changing the content type to application/x-www-form-urlencoded, but it doesn't work. I think I am forced to use…
Rakesh Prasad
1 vote, 1 answer

NiFi processor task count is very high; what could be the reason?

I wrote a basic custom processor, which sends the flow file to the "Retry" relationship and also calls penalize. package nlsn.processors.core.main; import java.util.Collections; import java.util.HashSet; import java.util.List; import java.util.Set; import…
Rakesh Prasad
1 vote, 1 answer

What is the Wrangler port in data transformation in NiFi?

I would like to know what the Wrangler input/cleanup port is in the DATA TRANSFORMATION template. For the data ingest template, NiFi provides input and cleanup ports, where we can define properties, directories, and so on. But when it comes to data…
IMRAN S K