We have a small Pivotal Hadoop (HAWQ) cluster and need to read an external table, i.e.:

select * from ext_table

But when I issue the query, HAWQ complains with the following error:

ERROR: missing data for column "SoldToAddr2"
What we have tried:

We tried different special characters as the delimiter in the FORMAT clause of the ext_table definition, for example:
CREATE READABLE EXTERNAL TABLE ext_table
(
"ID" INTEGER,
time timestamp,
"Customer" char(7),
"Name" varchar,
"ShortName" char(10),
"ExemptFinChg" char(1),
"MaintHistory" char(1),
"CustomerType" char(1),
"MasterAccount" char(7),
"StoreNumber" char(7),
"PrtMasterAdd" char(1),
"CreditStatus" char(1),
"CreditLimit" decimal(14),
"InvoiceCount" decimal(7),
"Salesperson" char(3),
"Salesperson1" char(3),
"Salesperson2" char(3),
"Salesperson3" char(3),
"PriceCode" char(2),
"CustomerClass" char(2),
"Branch" char(2),
"TermsCode" char(2),
"InvDiscCode" char(2),
"BalanceType" char(1),
"Area" char(2),
"LineDiscCode" char(2),
"TaxStatus" char(1),
"TaxExemptNumber" char(30),
"SpecialInstrs" char(30),
"PriceCategoryTable" char(52),
"DateLastSale" date,
"DateLastPay" date,
"OutstOrdVal" decimal(16),
"NumOutstOrd" decimal(6),
"Telephone" char(20),
"Contact" varchar,
"AddTelephone" char(20),
"Fax" char(20),
"Telex" char(10),
"TelephoneExtn" char(5),
"Currency" text,
"UserField1" char(10),
"UserField2" decimal(16),
"GstExemptFlag" char(1),
"GstExemptNum" char(15),
"GstLevel" char(1),
"DetailMoveReqd" char(1),
"InterfaceFlag" char(1),
"ContractPrcReqd" char(1),
"BuyingGroup1" char(2),
"BuyingGroup2" char(2),
"BuyingGroup3" char(2),
"BuyingGroup4" char(2),
"BuyingGroup5" char(2),
"StatementReqd" char(1),
"BackOrdReqd" char(1),
"ShippingInstrs" char(30),
"StateCode" char(3),
"DateCustAdded" date,
"StockInterchange" char(1),
"MaintLastPrcPaid" char(1),
"IbtCustomer" char(1),
"SoDefaultDoc" char(1),
"CounterSlsOnly" char(1),
"PaymentStatus" char(1),
"Nationality" char(3),
"HighestBalance" decimal(16),
"CustomerOnHold" char(1),
"InvCommentCode" char(3),
"EdiSenderCode" char(40),
"RelOrdOsValue" decimal(16),
"EdiFlag" char(1),
"SoDefaultType" char(1),
"Email" char(50),
"ApplyOrdDisc" char(1),
"ApplyLineDisc" char(1),
"FaxInvoices" char(1),
"FaxStatements" char(1),
"HighInvDays" decimal(5),
"HighInv" char(6),
"DocFax" char(20),
"DocFaxContact" char(40),
"SoldToAddr1" char(150),
"SoldToAddr2" char(80),
"SoldToAddr3" char(40),
"SoldToAddr4" char(40),
"SoldToAddr5" char(40),
"SoldPostalCode" char(9),
"ShipToAddr1" char(40),
"ShipToAddr2" char(40),
"ShipToAddr3" char(40),
"ShipToAddr4" char(40),
"ShipToAddr5" char(40),
"ShipPostalCode" char(9),
"State" char(2),
"CountyZip" char(5),
"City" char(3),
"State1" char(2),
"CountyZip1" char(5),
"City1" char(3),
"DefaultOrdType" char(2),
"PoNumberMandatory" char(1),
"CreditCheckFlag" char(1),
"CompanyTaxNumber" char(15),
"DeliveryTerms" char(3),
"TransactionNature" decimal(5),
"DeliveryTermsC" char(3),
"TransactionNatureC" decimal(5),
"RouteCode" char(10),
"FaxQuotes" char(1),
"RouteDistance" decimal(6),
"TpmCustomerFlag" char(1),
"SalesWarehouse" text,
"TpmPricingFlag" char(1),
"ArStatementNo" char(2),
"TpmCreditCheck" char(1),
"WholeOrderShipFlag" char(1),
"MinimumOrderValue" decimal(12),
"MinimumOrderChgCod" char(6),
"UkVatFlag" char(1),
"UkCurrency" char(3),
"TimeStamp" bytea
)
LOCATION (
'pxf://hostname/path/to/hdfs?profile=HdfsTextSimple')
FORMAT 'CSV' (delimiter '^' null 'null' quote '"')
ENCODING 'UTF8';
Error detail:

Bad rows are encountered at line 20:

DETAIL: External table tablename, line 20 of pxf://hostname/path/to/hdfs?profile=HdfsTextSimple: "23020^2015-12-02 11:14:26.52^0023482^Carlos iglesias ^ ^N^Y^ ^ ^ ..."
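
Would it make sense to first check whether line 20 actually contains fewer '^'-separated fields than the table has columns? Below is a rough sketch of what we had in mind (ext_table_raw is just a placeholder name, and it assumes the Greenplum-style delimiter 'OFF' option is accepted for a single-column TEXT external table over the same PXF location):

-- read each HDFS line as one raw text value, no field splitting
CREATE READABLE EXTERNAL TABLE ext_table_raw (raw_line text)
LOCATION ('pxf://hostname/path/to/hdfs?profile=HdfsTextSimple')
FORMAT 'TEXT' (delimiter 'OFF');

-- count '^'-separated fields per line; string_to_array/array_upper
-- are available in the PostgreSQL 8.2 base HAWQ derives from
SELECT array_upper(string_to_array(raw_line, '^'), 1) AS field_count,
       raw_line
FROM ext_table_raw
LIMIT 30;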
What is the best way to deal with bad rows in a Pivotal Hadoop-HAWQ system?

Any help would be much appreciated.
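
For reference, one option we are considering (only a sketch, assuming our HAWQ version still accepts the Greenplum-style LOG ERRORS INTO ... SEGMENT REJECT LIMIT clause; err_customers and the limit of 100 are just example values) is to let the scan log and skip malformed rows instead of aborting the whole query:

CREATE READABLE EXTERNAL TABLE ext_table_tolerant
(
    -- ... same column list as ext_table above ...
)
LOCATION ('pxf://hostname/path/to/hdfs?profile=HdfsTextSimple')
FORMAT 'CSV' (delimiter '^' null 'null' quote '"')
ENCODING 'UTF8'
LOG ERRORS INTO err_customers SEGMENT REJECT LIMIT 100 ROWS;

-- afterwards, inspect what was rejected and why
SELECT linenum, errmsg, rawdata FROM err_customers;

Is that the recommended approach, or is there a better way?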