Questions tagged [hawq]

This tag is for questions about Pivotal HAWQ, a SQL-on-Hadoop implementation.

Pivotal HAWQ supports low-latency analytic SQL queries, coupled with massively parallel machine learning capabilities, to shorten data-driven innovation cycles for the enterprise. HAWQ enables discovery-based analysis of large data sets and rapid, iterative development of data analytics applications that apply deep machine learning. It reads data from and writes data to HDFS natively. Using HAWQ, you can interact with petabyte-range data sets. HAWQ provides users with a complete, standards-compliant SQL interface to Hadoop.

126 questions
0 votes, 0 answers

How do I convert an MSSQL query to a Postgres query

I have to migrate a complex SQL query and need to convert it to Postgres. The complex SQL query has joins across more than 4 tables, lots of filters, aggregate functions, CASE WHEN THEN, etc. For example, sample input: Select ROW_NUMBER() OVER (ORDER BY getdate())…
NEO • 389 • 8 • 31

0 votes, 2 answers

HAWQ. join in/out rows by in/out time

HAWQ. How to join in/out rows by in/out time? simple thanks
Kobra • 313 • 1 • 15

0 votes, 1 answer

Greenplum - How To Handle Deadlock

When I try to run a SQL transaction from Greenplum, I get this error: Transaction (Process ID 52) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction. We tried: On SQL Server it…
NEO • 389 • 8 • 31

0 votes, 1 answer

Virtual segment memory/core allocation in Apache HAWQ

I am trying to tweak the HAWQ configurations below at session level for a query: SET hawq_rm_stmt_nvseg = 40; SET hawq_rm_stmt_vseg_memory = '4gb'; HAWQ is running on the YARN resource manager with Minimum HAWQ queue Used capacity…
S. K • 495 • 2 • 7 • 14

0 votes, 2 answers

HAWQ data to replicate between clusters

I have a requirement to refresh the QA environment from the production HAWQ database on a daily basis. How can I move each day's delta from the production cluster into the QA cluster? Appreciate your help. Thanks, Veeru
0 votes, 1 answer

ERROR: CANNOT PARALLELIZE AN UPDATE STATEMENT THAT UPDATES THE DISTRIBUTION COLUMNS

I am trying to copy data from the source (MSSQLSERVER) to the target (Greenplum database) using a Talend ETL server. Description: When executing an UPDATE statement against Greenplum, the error above is thrown. Given: the number of records fetched to the target is ~ 0.3…
NEO • 389 • 8 • 31

0 votes, 2 answers

Error while executing k-means using the MADlib library on Greenplum

I am trying to run the k-means algorithm using the MADlib library (tool used: Aginity). I tried executing: SELECT * FROM madlib.kmeans_random('select "MPrice" as "MPrice" from…
vkumar • 31 • 1 • 10

0 votes, 2 answers

ERROR: value too long for type character(50)

I have created an external table in HDFS and an internal table in HAWQ. I am fetching data from SQL Server, using Talend for the ETL process. The process flow is: SQLSERVER -> EXTERNAL TABLE(PXF HAWQ) -> INTERNAL TABLE(HAWQ). On running the job I am getting…
vkumar • 31 • 1 • 10

0 votes, 1 answer

Where can I find the location of distributed file on slaves using Apache HAWQ?

I am using Apache HAWQ and trying to handle some data. I have one master node and two HAWQ slaves. I created a table, inserted the data, and verified the data that I inserted using PostgreSQL. I thought that the data was mostly distributed on…
sclee1 • 1,095 • 1 • 15 • 36

0 votes, 1 answer

Difference between external table and internal table when using Apache HAWQ?

I am using HAWQ to handle a column-based file. The Pivotal documentation suggests that users should use gpfdist to read and write the readable external table in order to process the data quickly in parallel. I made a table as…
sclee1 • 1,095 • 1 • 15 • 36

0 votes, 1 answer

Error when testing gpload on Windows

I am trying to execute gpload from a Windows-based ETL host. Using gpload in a Windows environment produces the following error. Error I get: gpload.py -f gpload.yml gpload was unable to import The PyGreSQL Python module (pg.py) - DLL load failed…
NEO • 389 • 8 • 31

0 votes, 0 answers

Pivotal Greenplum - gpload issue on Windows

When I try to execute the gpload program from a Windows server, it fails with an error. Error I get: ERROR | could not connect to database: global name 'pg' Is the Greenplum Database running on port 5432? We tried: 1) CHECKED env variable…
NEO • 389 • 8 • 31

0 votes, 1 answer

Pivotal Greenplum - Incremental Data issue

When I try to capture an incremental load in one SQL transaction, the update does not work. Basically, it keeps executing indefinitely for 90k rows. Input SQL transaction: BEGIN; INSERT INTO IncrementalLoad_Dest.dbo.tblDest (ColID, ColA, ColB,…
NEO • 389 • 8 • 31

0 votes, 1 answer

Pivotal Greenplum - gpload issue with Talend

I am trying to run the gpload process from a Talend ETL server. For that, I first need to configure the tGreenplumGPLoad component. During configuration, the component looks for files on the remote Greenplum server instead of the local Windows-based Talend ETL files…
NEO • 389 • 8 • 31

0 votes, 1 answer

Pivotal GPDB: How to run queries without double quotes on tables and columns

When I query Greenplum, including double quotes around every column in the select list takes time. Input DDL (scenario): CREATE TABLE "People" ( "ID" SERIAL NOT NULL, "Email" TEXT NOT NULL, PRIMARY KEY(id) ); Error I…
NEO • 389 • 8 • 31