Questions tagged [data-integration]

Data integration is the combination of technical and business processes used to combine data from disparate sources into meaningful and valuable information. A complete data integration solution encompasses discovery, cleansing, monitoring, transforming and delivery of data from a variety of sources.

It is a major topic in IT because it ultimately aims to make all systems work seamlessly together.

Example: the data warehouse

The process must take place among the organization's primary transaction systems before data arrives at the data warehouse.
It is rarely complete unless the organization has a comprehensive and centralized master data management (MDM) system.

Data integration usually takes the form of conforming dimensions and facts in the data warehouse. Conforming dimensions means establishing common dimensional attributes across separate databases. Conforming facts means agreeing on common business metrics, such as key performance indicators (KPIs), across separate databases, so that these numbers can be compared mathematically.
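As a toy illustration of conforming a dimension (all system, column, and value names here are invented for the example), the work amounts to mapping each source's attributes onto one shared set of attribute names and value domains:

```python
# Minimal sketch: conform a "customer" dimension from two hypothetical
# source systems so facts from both can later be joined and compared.

crm_customers = [
    {"cust_no": "C01", "cust_name": "Acme Ltd ", "region_cd": "EU"},
]
billing_customers = [
    {"id": 1, "name": "Acme Ltd", "territory": "Europe"},
]

# Shared value domain for the conformed "region" attribute.
REGION_MAP = {"EU": "Europe", "NA": "North America"}

def conform_crm(row):
    # Map CRM-specific attribute names/codes onto the shared attributes.
    return {"customer_name": row["cust_name"].strip().lower(),
            "region": REGION_MAP.get(row["region_cd"], "Unknown")}

def conform_billing(row):
    # The billing system already uses full region names.
    return {"customer_name": row["name"].strip().lower(),
            "region": row["territory"]}

conformed = ([conform_crm(r) for r in crm_customers]
             + [conform_billing(r) for r in billing_customers])

# Both sources now describe the same customer identically.
print(conformed[0] == conformed[1])  # prints True
```

In a real warehouse this mapping would live in the ETL layer (e.g. a Pentaho or Talend transformation) rather than in ad-hoc code, but the principle is the same: one agreed attribute set, one agreed value domain per attribute.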

332 questions
2
votes
2 answers

Pentaho Data Integration Input / Output Bit Type Error

I am using Pentaho Data Integration for numerous projects at work. We predominantly use Postgres for our databases. One of our older tables has two columns that are set to type bit(1) to store 0 for false and 1 for true. My task is to synchronize a…
John B
  • 159
  • 3
  • 14
2
votes
1 answer

How to connect to a Secure Gateway's destination with TLS in DataWorks

I would like to load on-premise Oracle data into Bluemix dashDB. I plan to use DataWorks and Secure Gateway. It is required that only DataWorks can access Secure Gateway. According to the tutorial Securing Destinations with TLS in Bluemix Secure…
shimac-jp
  • 233
  • 3
  • 11
2
votes
0 answers

JSON input in pentaho data integration

I have JSON inside a JSON array coming from MongoDB. { "_id" : { "$oid" : "54b76bce44ae90e9e919d6e1" } , "_class" : "com.argusoft.hkg.nosql.model.HkEventDocument" , "featureName" : "EVENT" , "instanceId" : 577 , "fieldValue" : {…
Shreya Shah
  • 582
  • 1
  • 4
  • 17
2
votes
2 answers

Pentaho Row Denormaliser Step Not Working

I have some sorted data that I'm trying to denormalize but the step in Pentaho isn't working correctly. Here is a snapshot of the sorted data: And here is a snapshot of the Row Denormaliser Step as I've configured it: What I get is: There are no…
Dezzie
  • 934
  • 3
  • 18
  • 35
2
votes
1 answer

Format was not found error when running an imported EG on DI

I have exported an Enterprise Guide project to Data Integration Studio. When I run it I get: Error: The format CHAR was not found or could not be loaded. Is there any way to fix it without changing the EG project's code? Thanks, Gal. * Edit * Code…
user2518751
  • 685
  • 1
  • 10
  • 20
2
votes
1 answer

Which ETL tool is most configurable?

I am looking for the best-suited ETL tool for the following criteria. Supports MongoDB. Accepts metadata as input (or accepts a file and builds its metadata on the fly). Provides configurable mapping (mapping can be defined outside development, using…
Kaushal
  • 908
  • 1
  • 8
  • 19
2
votes
1 answer

Unable to connect to HDFS using PDI step

I have successfully configured Hadoop 2.4 in an Ubuntu 14.04 VM from a Windows 8 system. The Hadoop installation is working absolutely fine, and I am also able to view the NameNode from my Windows browser. Attached image below. So, my host name is :…
Rishu Shrivastava
  • 3,745
  • 1
  • 20
  • 41
2
votes
1 answer

Data Integration: Bring Data to a Standard Format

I am trying to build a data integration process using an ETL tool (Talend). The challenge I am facing is bringing data from various sources (in different formats) into a single format. The sources may have different column names and…
Kaushal
  • 908
  • 1
  • 8
  • 19
2
votes
1 answer

How to add line numbers to a file in Pentaho Data Integration (Kettle)?

I have a file names.txt with this data: NAME;AGE; alberto;22 andrea;51 ana;16 and I want to add a new column N with the line number of the row: N;NAME;AGE; 1;alberto;22 2;andrea;51 3;ana;16 I've been looking and what I found was something related…
japmelian
  • 97
  • 3
  • 9
2
votes
1 answer

Can you set Fixed File Input column definitions dynamically in Pentaho data-integration (PDI)?

I have a metadata file which contains the column name, starting position, and length. I would like to read these values and define my columns within a FIXED FILE INPUT step. Is there a way to do this in PDI? My file contains over 200 columns at a…
bg80
  • 33
  • 3
2
votes
1 answer

Pentaho Data Integration (DI) Get Last File in a Directory of a SFTP Server

I am doing a transformation in Pentaho Data Integration and I have a list of files in a directory of my SFTP server. These files are named with the FILE_YYYYMMDDHHIISS.txt format; my directory looks like…
2
votes
1 answer

Can JStestDriver be used to test js code within JSP files?

Quick question: is it possible to do unit testing, particularly with JStestDriver, on JavaScript code written inside JSP files? Or do I necessarily have to extract it into external .js files?
1
vote
0 answers

Talend application scenarios: is it correct to have logical operators in the first term of GAV mapping?

I'm working on a data integration project that makes use of Talend through a first logical formalization step of the global schema. I chose to exploit the GAV mapping properties to connect my source schema (the tables and CSV files from which I…
1
vote
1 answer

SAS SQL Pass-Through Facility does not work as expected for Postgres database

I am working with the SCD Type 2 transformation in SAS Data Integration Studio (4.905) and using Postgres (12) as the database. I am facing the following error when I try to execute a query via pass-through: When using pass-through in Postgres, SCD Type 2…
1
vote
0 answers

How to figure out the body structure of a REST request job?

I'm trying to transform Talend jobs into Spring Boot REST services, but I can't figure out the body of this POST request. The Job 1 picture illustrates a JSON extract, and the subjob picture shows another JSON extract inside a loop, so I suppose the expected JSON is…