Questions tagged [data-integration]

Data integration is the combination of technical and business processes used to combine data from disparate sources into meaningful and valuable information. A complete data integration solution encompasses discovery, cleansing, monitoring, transforming and delivery of data from a variety of sources.

It is a major topic in IT because its ultimate aim is to make all systems work seamlessly together.

Example with data warehouse

Integration must take place among the organization's primary transaction systems before data arrives at the data warehouse.
It is rarely complete unless the organization has a comprehensive, centralized master data management (MDM) system.

Data integration usually takes the form of conforming dimensions and facts in the data warehouse. Conforming dimensions means establishing common dimensional attributes across separate databases. Conforming facts means agreeing on common business metrics, such as key performance indicators (KPIs), across separate databases so that these numbers can be compared mathematically.
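
As a rough illustration of conforming dimensions and facts, here is a minimal Python sketch. Everything in it is hypothetical: the two source systems (`crm_*` and `billing_*`), their column names, the code-to-name mappings, and the sample values are invented for the example and do not come from any particular tool. It maps both sources onto one shared customer key and one shared revenue unit so the numbers can be added together.

```python
from collections import defaultdict

# Hypothetical lookup tables used to conform dimension attributes.
COUNTRY_NAMES = {"DE": "Germany", "FR": "France"}
SEGMENT_NAMES = {"ENT": "Enterprise", "SMB": "Small business"}

# Hypothetical rows from two separate source databases.
crm_customers = [
    {"cust_id": "C-1", "country": "DE", "segment": "ENT"},
    {"cust_id": "C-2", "country": "FR", "segment": "SMB"},
]
crm_sales = [
    {"cust_id": "C-1", "revenue_cents": 125000},
    {"cust_id": "C-2", "revenue_cents": 40000},
]
billing_invoices = [
    {"customer": 1, "amount_eur": 300.0},
    {"customer": 2, "amount_eur": 75.5},
]


def crm_key(cust_id):
    """Turn a CRM id such as 'C-1' into the shared integer customer key."""
    return int(cust_id.split("-")[1])


# Conformed dimension: one row per customer with common attribute values.
customer_dim = {
    crm_key(row["cust_id"]): {
        "country": COUNTRY_NAMES[row["country"]],
        "segment": SEGMENT_NAMES[row["segment"]],
    }
    for row in crm_customers
}


def conformed_facts():
    """Yield revenue facts from both sources in one unit (EUR) and one key."""
    for row in crm_sales:
        yield {
            "customer_key": crm_key(row["cust_id"]),
            "revenue_eur": row["revenue_cents"] / 100,
        }
    for row in billing_invoices:
        yield {
            "customer_key": row["customer"],
            "revenue_eur": row["amount_eur"],
        }


def revenue_by_country():
    """Sum the conformed revenue metric per conformed country attribute."""
    totals = defaultdict(float)
    for fact in conformed_facts():
        country = customer_dim[fact["customer_key"]]["country"]
        totals[country] += fact["revenue_eur"]
    return dict(totals)


if __name__ == "__main__":
    print(revenue_by_country())
```

Once both sources share the same key and unit, the per-country total is a plain sum. In a real warehouse this mapping would normally live in the ETL layer or in an MDM system rather than in hard-coded dictionaries.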

332 questions
1
vote
1 answer

How to connect Power BI with the website database?

I have a Python-based website hosted on pythonanywhere.com. The website asks for feedback from the user and stores the input in sqlite3. How do I access the website's database in Microsoft Power BI (preferably in real time, or periodically otherwise) to…
Musaib Jan
  • 121
  • 8
1
vote
0 answers

How to execute Oracle functions from Pyspark script

I have a requirement to execute an Oracle function from Pyspark Script. Is there any way to execute Oracle functions/procedures without using additional python libraries such as cx_Oracle in Pyspark?
1
vote
0 answers

submit data from Database to Soap service using SSIS

I want to submit data from a database to a SOAP web service (already implemented). I just want to consume it to submit data to it. I don't know what's the best way to do it. I tried to use SSIS and successfully called the service using static data, but…
1
vote
1 answer

SAS Data Integration : SAP table extract error

This is the error code: Line 115: ERROR: RFC_ERROR_SYSTEM_FAILURE Error in module RSQL of the database interface. NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements. Error Log: NOTE: Libref SAPENG was…
gensius
  • 243
  • 1
  • 2
  • 7
1
vote
1 answer

Preventing invalid JOINs using dimension/fact tables that contain NULL Foreign Keys

Goal We're trying to produce fact and dimension tables that will be easy for anybody to use. Many modern BI systems promote exploration and experimentation and we want people of all skill levels to be successful. Problem Our data has tons of…
1
vote
1 answer

Pentaho data integration with REST

I am trying to connect to a REST API over SSL with un/pwd authentication. I am able to browse the URL; however, when I run the job nothing happens. Essentially I just want to connect to the server and output the data in an XML file. Thank you in…
akaphenom
  • 6,728
  • 10
  • 59
  • 109
1
vote
1 answer

Need help in reducing the complexity or duplication in the function

Hi, can someone please help me reduce the complexity of the code below? As I am new to this, I need to reduce the amount of code, improve its simplicity, and reduce duplication in the overall coding. Any help…
1
vote
1 answer

How much of Talend functionality is translated in SQL-Query and how much in Java?

I am facing an internship and they asked me to learn how to use talend ETL. I did it, not so difficult. One of the extra-tasks that have been assigned to me is to verify how much of the operations I set on the design workspace is executed in java…
1
vote
0 answers

Pentaho Kettle Failed to parse Number: For input string: "null" error

I am new to Pentaho Kettle and I am trying to read simple data (.DBF file) with Xbase. But I keep getting errors when reading my DBF data file The error is: ... Unable to read row from XBase file; Failed to parse Number: For input string:…
Elsa
  • 1
  • 1
  • 8
  • 27
1
vote
2 answers

Integration after a merger: Camel or XAware?

Following a merger of two companies, what would be the best tool for enterprise integration: - Camel or XAware? - or both for different needs? It seems that there is some overlap with maybe XAware more focused on data integration and Camel having a…
1
vote
1 answer

Creating Hierarchical data with a single Column in Talend

I have a dataset that Looks like : Dataset In talend, and I have to create Hierarchical data using these columns, sample output is: Sample Output I can do it using TJavaRow but I cannot code, I have to do it purely using Talend components. So far…
1
vote
0 answers

Pentaho JsonInput GET fields

I'm trying to use PDI to read data from an API (json) and now I'm simply trying to use json input to get a few specific fields but the get fields button on the input step gives me. ERROR (version 8.3.0.0-371, build 8.3.0.0-371 from 2019-06-11…
Tony
  • 8,681
  • 7
  • 36
  • 55
1
vote
2 answers

Continuously replicate data from Oracle to ElasticSearch

The team where I'm working has the luck to work on redesigning a huge legacy system, with Oracle 12 on the database end. Currently, 10% of this monster software's DB operations are insert/update/delete operations; the remaining 90% are select operations…
1
vote
1 answer

Pentaho Data Integration Import large dataset from DB

I'm trying to import a large set of data from one DB to another (MSSQL to MySQL). The transformation does this: gets a subset of data, check if it's an update or an insert by checking hash, map the data and insert it into MySQL DB with an API…
1
vote
1 answer

How to make requests in third party APIs and load the results periodically on google BigQuery? What google services should I use?

I need to get the data from a third-party API and ingest it into Google BigQuery. I also need to automate this process through Google services so that it runs periodically. I am trying to use Cloud Functions, but it needs a trigger. I have also read…