Questions tagged [data-integration]

Data integration is the combination of technical and business processes used to combine data from disparate sources into meaningful and valuable information. A complete data integration solution encompasses discovery, cleansing, monitoring, transforming and delivery of data from a variety of sources.

It is a huge topic in IT because it ultimately aims to make all systems work seamlessly together.

Example with a data warehouse

The process must take place among the organization's primary transaction systems before data arrives at the data warehouse.
It is rarely complete unless the organization has a comprehensive and centralized master data management (MDM) system.

Data integration usually takes the form of conforming dimensions and facts in the data warehouse. Conforming dimensions means establishing common dimensional attributes across separate databases. Conforming facts means agreeing on common business metrics, such as key performance indicators (KPIs), across separate databases so that these numbers can be compared mathematically.
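
As a rough illustration of conforming dimensions and facts, here is a minimal Python sketch. The two source systems, their field names, the `COUNTRY_MAP` lookup, and the helper functions are all hypothetical and chosen only for this example; a real warehouse would normally do this in its ETL tool or in SQL.

```python
# Hypothetical rows from two separate source systems (assumed for illustration).
crm_sales = [{"cust": "C-001", "country": "US", "revenue_usd": 120.0}]
shop_sales = [{"customer_id": 1, "country": "United States", "revenue_cents": 4550}]

# Conformed dimension attribute: one agreed spelling per country.
COUNTRY_MAP = {"US": "United States", "United States": "United States"}


def conform_crm(row):
    """Map a CRM row onto the shared customer key, country and metric."""
    return {
        "customer_key": int(row["cust"].split("-")[1]),
        "country": COUNTRY_MAP[row["country"]],
        "revenue_usd": row["revenue_usd"],
    }


def conform_shop(row):
    """Map a shop row onto the same shape, converting cents to dollars."""
    return {
        "customer_key": row["customer_id"],
        "country": COUNTRY_MAP[row["country"]],
        "revenue_usd": row["revenue_cents"] / 100,
    }


def revenue_by_country(rows):
    """Sum the conformed revenue fact per conformed country attribute."""
    totals = {}
    for r in rows:
        totals[r["country"]] = totals.get(r["country"], 0.0) + r["revenue_usd"]
    return totals


conformed = [conform_crm(r) for r in crm_sales] + [conform_shop(r) for r in shop_sales]
print(revenue_by_country(conformed))
```

Once both sources share the same customer key, country attribute, and revenue unit, the totals can be added and compared directly, which is the point of conforming.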

332 questions
2
votes
2 answers

Talend Open-Studio Supported I/O Formats

I'm considering Talend's Open Studio for a Data Integration / ETL project, and I can't seem to find a list of formats that it can read from or write to out of the box. For instance, I'm comparing it against Pentaho's Kettle, which I found…
weberc2
  • 7,423
  • 4
  • 41
  • 57
2
votes
2 answers

Is it possible to execute pentaho step in sequence?

I have a Pentaho transformation which consists of, for example, 10 steps. I want to start this job for N input parameters, but not in parallel: each job evaluation should start after the previous transformation has fully completed (process done in…
yura
  • 14,489
  • 21
  • 77
  • 126
2
votes
1 answer

How to change the data in a column in SAS Data Integration?

I have an existing ETL solution built in SAS Data Integration, where one of the columns is initially set to have all null values. I want to populate that column with actual data. The original column in that table was set to receive numeric values…
Eyal Marom
  • 281
  • 4
  • 18
2
votes
2 answers

Where to download sun.jdbc.odbc.JdbcOdbcDriver (trying to connect output csv from Spoon to SSMS)

I have a csv that I have transformed in Kettle/Spoon/PDI and I am trying to output it to SSMS. In Spoon, it's a two step process: read the csv (and edit a couple types), then output to SQL. I get this error: "Driver class…
CD9999
  • 109
  • 1
  • 3
  • 14
2
votes
2 answers

SSIS Custom Component: Having any IsSorted Property and output metadata

I have a Custom Synchronous Component that works fine and I use it. Recently, I sent some sorted data from a Sort component to it (or an IsSorted=true Source Component), but then I couldn't use the output as the input of a Merge Join due to not…
2
votes
0 answers

Nightly data integration using NServiceBus

In the investment firm I work for, we have several daily data integration tasks that run every night/early morning. Most if not all are done using SSIS, and they are all scheduled to start at certain times. They work (as in the SSIS packages do their…
zorrinn
  • 45
  • 1
  • 6
2
votes
2 answers

How to conduct Data Cleaning with Spark-Python based on HDFS

Currently, I am focusing on the data preprocessing in a data mining project. To be specific, I want to do the data cleaning with PySpark based on HDFS. I'm very new to those things, so I want to ask how to do that. For example, there is a table in the…
2
votes
2 answers

Installing Silk Workbench on windows 10?

I am using Java 8 on Windows 10 and I have to use the Silk linked data integration tool. I downloaded the latest version of Silk Workbench from GitHub. I actually do not know what I should do with it. It is mentioned in the readme that the bin folder…
Reihan_amn
  • 2,645
  • 2
  • 21
  • 21
2
votes
3 answers

Talend and Apache Spark?

I am confused as to where Talend and Apache Spark fit in the big data ecosystem, as both Apache Spark and Talend can be used for ETL. Could someone please explain this with an example?
user2803194
  • 105
  • 2
  • 11
2
votes
3 answers

When to combine multiple apps to simplify their data integration?

Short version: We have multiple teams each developing multiple apps. They need to share some data. Should we combine these apps into one larger one to simplify data integration or should we keep them separate and utilize some data exchange/caching…
Robert Campbell
  • 6,848
  • 12
  • 63
  • 93
2
votes
2 answers

What is Snaplogic?

As per Wikipedia: SnapLogic is a commercial software company that provides Integration Platform as a Service (iPaaS) tools for connecting Cloud data sources, SaaS applications and on-premises business software applications. It is surely a…
Bilesh Ganguly
  • 3,792
  • 3
  • 36
  • 58
2
votes
2 answers

Replace null value with NA using Pentaho Kettle

I have an input CSV file in which one column's field value is empty. I want to replace that field value with NA in my destination table, where that column is specified as a NOT NULL column. I tried using if field value is null, value…
Lavanya D.
  • 491
  • 2
  • 6
  • 15
2
votes
3 answers

Data integration between IBM AS400 to SQL Server database

I'm a web developer who has been tasked with creating some sort of mechanism for moving data from an IBM AS400 to a SQL Server. Unfortunately, linked servers are out of the question in this case as the SQL Server is just Standard Edition (db2…
ghoston3rd
  • 129
  • 2
  • 5
  • 14
2
votes
3 answers

Eloqua BulkAPI 2.0 returns 404: There was a validation error

I am trying to create a contacts export in Eloqua following this tutorial. The outcome I experience is: HTTP/1.1 400 There was a validation error. { "failures": [{ "field": "name", "constraint": "Must be a string value, at least…
JenM
  • 21
  • 3
2
votes
2 answers

Data integration ETL with java web application?

I'm a newbie to business intelligence, and I'm going to develop a Java web application. I want to integrate data from different sources so that I can store it in a database. Is there an API or jars of Pentaho or Talend or other ETLs that I can…
pietà
  • 760
  • 1
  • 11
  • 37