
Is it possible for BULK INSERT/bcp to read from a named pipe, fifo-style?

That is, rather than reading from a real text file, can BULK INSERT/bcp be made to read from a named pipe that another process is writing to?

For example:

  1. create named pipe
  2. unzip file to named pipe
  3. read from named pipe with bcp or BULK INSERT

or:

  1. create 4 named pipes
  2. split 1 file into 4 streams, writing each stream to a separate named pipe
  3. read from 4 named pipes into 4 tables w/ bcp or BULK INSERT

The closest I've found was this fellow (site now unreachable), who managed to get bcp to write to a named pipe, using his own utility and invoking it like so:

start /MIN ZipPipe authors_pipe authors.txt.gz 9
bcp  pubs..authors out  \\.\pipe\authors_pipe -T -n

But he couldn't get the reverse to work.

So before I head off on a fool's errand, I'm wondering whether it's fundamentally possible to read from a named pipe w/ BULK INSERT or bcp. And if it is possible, how would one set it up? Would NamedPipeServerStream or something else in the .NET System.IO.Pipes namespace be adequate?

e.g., an example using PowerShell:

[reflection.Assembly]::LoadWithPartialName("system.core")
$pipe = New-Object system.IO.Pipes.NamedPipeServerStream("Bob")

And then....what?

Peter Radocchia
  • What happens when you replace `out` with `in`? It should work... – Aaronaught Feb 04 '10 at 02:57
  • I haven't tried myself, but the fellow reported it didn't work. – Peter Radocchia Feb 04 '10 at 03:18
  • You may be able to do it programmatically with the SqlBulkCopy class (I posted it as a comment because it's just a hint) – momobo Feb 04 '10 at 11:31
  • Yes, it can be done programmatically w/ SqlBulkCopy, or in SSIS of course. On Unix-like systems, mkfifo will make a "fifo" or named pipe on the file system that looks *just like an ordinary file*, and then you set up one process to write to it and one process to read. The data is then piped through the fifo, never being materialized on disk. This is the type of process I am trying to replicate. – Peter Radocchia Feb 04 '10 at 14:56
  • If you're trying bulk load data into the server, and don't mind writing a .NET assembly to do it, why not just use the bulk insert functionality in ADO.NET? http://msdn.microsoft.com/en-us/library/7ek5da1a.aspx – Chris Smith Mar 15 '12 at 02:02

4 Answers


Unfortunately, the SSIS flat file adaptors, BULK INSERT, and BCP all take an exclusive write lock on the file (even though they never actually write to it). This is why reading from a pipe doesn't work.

I am not sure pipes can be set up to allow two exclusive locks on the same pipe without some serious hacking. You could detour it, I suppose, or hack into fltmgr.sys :)

As the other posters suggested, using the .NET bulk copy API (SqlBulkCopy), or the OLEDB or ODBC bulk interfaces, is likely simpler, even though it means you have to write your own file parser.
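
For illustration, a minimal SqlBulkCopy sketch along those lines might look like the following; the connection string, table, and column names are placeholders, and a real loader would stream rows from its own parser rather than build a DataTable in memory:

    using System.Data;
    using System.Data.SqlClient;

    class BulkCopySketch
    {
        static void Main()
        {
            // Parse the source yourself (file, pipe, or decompressed stream)
            // into rows, then hand them to SqlBulkCopy.
            var table = new DataTable();
            table.Columns.Add("Col1", typeof(string));
            table.Columns.Add("Col2", typeof(string));
            table.Rows.Add("val1", "val2");

            using (var connection = new SqlConnection(
                "Server=.;Database=MyDb;Integrated Security=true"))
            {
                connection.Open();
                using (var bulkCopy = new SqlBulkCopy(connection))
                {
                    bulkCopy.DestinationTableName = "MyTable";
                    bulkCopy.WriteToServer(table);
                }
            }
        }
    }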

Thomas Kejser

I have succeeded in getting BULK INSERT (but not BCP) to work correctly with named pipes on Windows 7 and SQL Server 2008 R2. There are some tricks.

First, I had to create two named pipe instances on two different threads, both with the same pipe name. SQL Server would open the first instance, read a few bytes from it, and close it, causing WriteFile to raise a PipeException in the first thread. SQL Server would then immediately reopen the named pipe and stream in all of the data from it. If I didn't have a second thread sitting in the background ready to serve the data, SQL Server would return an error before my first thread had time to recover from the PipeException.

Second, I had to write all of the data in a single call to WriteFile. I started with a loop where I wrote multiple batches to the pipe, but BULK INSERT only used the first batch that I wrote. It seems to do a non-blocking read, and treat any read that returns zero bytes as an end-of-file.

Third, an XML format file, if used, must be written to a regular file. I have not succeeded in getting SQL Server to read the format file from a pipe. I don't know if it can read a non-XML format file from a pipe.
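
A minimal C# sketch of this recipe might look like the following; the pipe name, target table, and tab-delimited payload are illustrative assumptions, not code from the original experiment:

    using System;
    using System.IO;
    using System.IO.Pipes;
    using System.Text;
    using System.Threading;

    class PipeBulkInsertSketch
    {
        const string PipeName = "bulk_pipe"; // hypothetical pipe name

        static void Main()
        {
            // Two instances of the same pipe: SQL Server opens the first,
            // reads a few bytes, and closes it; the second serves the data.
            var probe = new Thread(ServePipe);
            var server = new Thread(ServePipe);
            probe.Start();
            server.Start();

            // At this point, run from SQL Server, e.g.:
            //   BULK INSERT MyTable FROM '\\.\pipe\bulk_pipe' WITH (FIELDTERMINATOR = '\t')
            probe.Join();
            server.Join();
        }

        static void ServePipe()
        {
            // maxNumberOfServerInstances = 2 so both threads can create the same pipe.
            using (var pipe = new NamedPipeServerStream(
                PipeName, PipeDirection.Out, 2, PipeTransmissionMode.Byte))
            {
                try
                {
                    pipe.WaitForConnection();
                    // All data in a single Write: BULK INSERT appears to treat
                    // a zero-byte read as end-of-file.
                    var payload = Encoding.ASCII.GetBytes("val1\tval2\r\nval3\tval4\r\n");
                    pipe.Write(payload, 0, payload.Length);
                    pipe.WaitForPipeDrain();
                }
                catch (IOException)
                {
                    // Expected on the instance SQL Server probes and abandons.
                }
            }
        }
    }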

Dan Menes
  • Dan, this is very interesting, thank you for posting the steps you took to make it work. I apologize for not replying earlier, I've been away from SO for a while, and I also have not tried your technique myself, but I don't see any reason it wouldn't work for, say, extracting a zip file to a named pipe and reading it directly into the database. What errors did you get with bcp? – Peter Radocchia Aug 15 '13 at 02:20
  • If I recall correctly, BCP would complain of an unexpected end of file. Based on what I've read on the Web, I think what is happening is that BCP attempts to do a seek() on its input file, which fails if the input is a pipe. – Dan Menes Aug 15 '13 at 15:04
  • I can confirm that this method works, though I'm struggling with the fact that SQL Server cuts off the pipe prematurely. It takes 1024 characters. Still investigating the source of this limit. – Grimace of Despair Oct 24 '13 at 14:40

I'd have commented on @DanMenes' answer (thanks for the inspiration), but for reference purposes I'm adding it as a separate answer.

I've worked out a solution in .NET which opens up a pipe (actually 2, first one gets destroyed like @DanMenes said), prepares streaming the data to it and then starts the BULK INSERT with an auto-generated format file.

The premise is that I can do stuff like

  var inMemoryData = new[] {
    new[] { "val1", "val2" },
    new[] { "val3", "val4" },
  };

  using (var importer = new Importer(SqlConnection, "MyTable", "Col1", "Col2"))
  {
    importer.Import(inMemoryData);
  }

I'll summarize the implementation of Importer:

1. Create the pipe

// Two instances of the same pipe name: SQL Server's probe connection
// destroys the first one (see @DanMenes' answer); the second serves the data.
var stream = new NamedPipeServerStream(name, PipeDirection.Out, 2, PipeTransmissionMode.Byte, PipeOptions.Asynchronous);
stream.BeginWaitForConnection(OnConnection, this);

2. Accept connections

public void OnConnection(IAsyncResult asyncResult)
{
  Stream.EndWaitForConnection(asyncResult);

  // Write everything in one call: BULK INSERT treats a zero-byte read
  // as end-of-file, so the data cannot be dribbled out in batches.
  var buffer = Encoding.UTF8.GetBytes(data);
  Stream.Write(buffer, 0, buffer.Length);
  Stream.Close();
}

3. Start BULK INSERT

var insertCommand = DbConnection.CreateCommand();
// Verbatim string, so the backslashes in the pipe path survive.
insertCommand.CommandText = @"BULK INSERT [MyTable] FROM '\\.\pipe\mypipe' WITH (FORMATFILE='c:\path\to\formatfile')";
insertCommand.ExecuteNonQuery();
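
For reference, an auto-generated XML format file for this two-column example might look roughly like the following; the column names, terminators, and lengths here are assumptions for illustration, and per @DanMenes' answer the format file itself must live on disk, not on a pipe:

    <?xml version="1.0"?>
    <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <RECORD>
        <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="\t" MAX_LENGTH="255"/>
        <FIELD ID="2" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="255"/>
      </RECORD>
      <ROW>
        <COLUMN SOURCE="1" NAME="Col1" xsi:type="SQLVARYCHAR"/>
        <COLUMN SOURCE="2" NAME="Col2" xsi:type="SQLVARYCHAR"/>
      </ROW>
    </BCPFORMAT>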

See the GitHub project for more details.

Note: I have yet to add performance tests to the project, but preliminary tests did show performance gains between 2x and 5x with respect to transactional INSERTs.

Grimace of Despair
  • Do you know why this solution works for BULK INSERT and not for bcp? – Swapnil Jul 13 '15 at 10:50
  • I can only point you to the comments on @DanMenes' answer: it might have something to do with bcp trying to seek() on the input file. – Grimace of Despair Jul 22 '15 at 12:26
  • The BULK INSERT command can only load data into the server on which the named pipe is created. I am using a SQL Server Agent job to load the data into a different server, so unfortunately I cannot use BULK INSERT with a named pipe. Is there any way to achieve this? – Swapnil Jul 23 '15 at 06:19
  • I receive the below error when I replace **"\\.\pipe\mypipe"** with **"\\System.Environment.MachineName\pipe\mypipe"**: _System.Data.SqlClient.SqlException (0x80131904): Bulk load: An unexpected end of file was encountered in the data file_. Do you know the reason for this error and how to fix it? – Swapnil Aug 24 '15 at 10:20
  • Hi... Just playing with the code. We have a big (20 gig) file getting dropped in a 3 gig .tar.gz. I was trying to see if I could hook a C# GZipStream to a C# tar reader and use _gzipStream.CopyTo(pipeStream)_ through your implementation. I ran into a couple of things. First, that throw-away named pipe in the code: the empty data values result in a write of 1 null byte on the pipe. Turns out that was important. Without it, the SQL side throws an "incomplete" error immediately and never gets past the throwaway. – user1664043 May 26 '21 at 21:25
  • Second, _gzipStream.CopyTo(pipeStream)_ gets the "pipe is broken" error immediately on the write, unfortunately. Sucking a 20 gig file into memory and doing a single write with a 20 gig buffer sadly does not seem practical. Having to have the entire thing executed in a single write does seem to limit the use cases where one could do this. – user1664043 May 26 '21 at 21:30
  • From the SqlServer perspective, I don't see how the source of the data could be important, knowing it's just being fed from the calling managed code. So it would seem strange to me that the receiving pipe would act differently. Then again, I might be missing something here. You _do_ unzip while streaming, right? – Grimace of Despair May 27 '21 at 07:33
  • Yeah, the GZipStream is decompressing. I guess I was focusing more on the comment in the code that all the data had to be written in one big write. First, for large data sets, it seems like that may take long enough for the read to time out. Second, it seems like it would limit the applicability of the code to smaller data sets. I was hoping that comment wouldn't be literally true. I haven't looked in Reflector, but I assume CopyTo has a buffer internally and uses a while loop, grabbing a buffer off the source stream and writing it to the sink stream. I hoped it would be fast enough. – user1664043 May 27 '21 at 13:14
  • I ran my sample again, and it appears I'd gotten about 240K written to the named pipe before I got the "broken pipe" error. Stream.CopyTo() works in 80K buffer chunks, so I got 3 buffers written before it broke on the 4th. – user1664043 May 27 '21 at 20:38

Does BCP accept STDIN? If so, you might want to try just piping it straight through without creating a named pipe; for instance:

gunzip -c authors.txt.gz | bcp schema.tablename

slm
  • Don't you think if it was that simple, the OP wouldn't be asking this question? So, no, `bcp` does not accept `STDIN`: https://connect.microsoft.com/SQLServer/feedback/details/489777/bcp-and-stdio – underscore_d Oct 08 '16 at 12:04