
When I run the following program and paste 50000 symbols into the command line, the program receives only 4096 symbols. What should I do to get the full list of symbols?

#include <iostream>
#include <string>

using namespace std;

int main()
{
    char temp[50001];
    while (cin.getline(temp, 50001, '\n'))
    {
        string s(temp);
        cout << s.size() << endl;
    }
    return 0;
}

P.S. When I read the symbols from a file using fstream, it works fine.

  • That's a command-line limitation – yizzlez Jun 08 '14 at 14:11
  • Are you sure there are no delimiters in the input? Otherwise it could be a platform limit – Marco A. Jun 08 '14 at 14:11
  • I don't understand, why not read directly into the `std::string`? – Thomas Matthews Jun 08 '14 at 17:18
  • The problem is the same whether I read directly into std::string or do it as posted here. If you succeed in reading all the symbols, please share the code. – Eduard Bagrov Jun 08 '14 at 17:33
  • Does your file contain an EOF character? For example, on the Windows platform, the Ctrl-Z character represents an end of file. To verify this, use the `std::cin.get` method to input the text. Use `std::cin.gcount` to get the number of characters actually read. – Thomas Matthews Jun 08 '14 at 17:39
  • My fundamental question is: Are 4096 characters input or are only 4096 characters transferred to the string? – Thomas Matthews Jun 08 '14 at 17:40
  • I pasted 50000 characters into the command line with '\n' at the end, but only 4095 characters were transferred to the string – Eduard Bagrov Jun 08 '14 at 17:47
  • Maybe the limitation is the cut buffer? What happens if you redirect something directly into the input stream, like `type somefile | yourprogram.exe`? – Ulrich Eckhardt Jun 08 '14 at 18:18

2 Answers


I'm taking a leap here, but since many PowerShell terminals have 4096-character truncation limits (take a look at the Out-File documentation), this is likely a Windows command-line limitation rather than a getline limitation.

The same problem has been encountered previously by others: https://github.com/Discordia/large-std-input/blob/master/LargeStdInput/Main.cpp

Marco A.
  • Thank you. Can't we change the settings of VS to solve this issue? – Eduard Bagrov Jun 08 '14 at 17:35
  • It's not a VS limit AFAIK, but rather a console limit. And since it's by design, the usual suggestion is to work around the issue (as the other posts are suggesting) – Marco A. Jun 08 '14 at 18:17

I don't understand why you are reading into a character array, then transferring it into a string.

In any case, your issue may be with repeated allocations.

Reading into std::string directly
Two simple lines:

    std::string s;
    std::getline(std::cin, s, '\n');

Reading into an array first
Yes, there is a simpler method:

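    #define BUFFER_SIZE 8196  // Very important, named constant
    char temp[BUFFER_SIZE];
    cin.getline(temp, BUFFER_SIZE, '\n');

    // Get the number of characters actually extracted
    std::streamsize chars_read = cin.gcount();
    if (cin.good() && chars_read > 0)
        --chars_read;  // gcount() also counts the '\n' that was extracted but not stored

    std::string s(temp, static_cast<std::size_t>(chars_read));  // Here's how to transfer the characters.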

Using a debugger, inspect the value of chars_read to verify that the number of characters read is what you expect.

Binary reading
Some platforms provide translations between the data read and your program. For example, Windows uses Ctrl-Z as an EOF character; Linux uses Ctrl-D.

The input data may use a UTF encoding and contain values outside the printable ASCII range.

So, the preferred method is to read from a stream opened in binary mode. Unfortunately, cin cannot be opened easily in binary mode.
See Open cin in binary

The preferred method, if possible, is to put the text into a file and read from the file.
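
A minimal sketch of the file route, assuming the pasted text has been saved to a file first. The function name `read_file_binary` is hypothetical. Opening the stream with `std::ios::binary` avoids the platform text translations mentioned above.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the entire contents of `path`, byte for byte.
std::string read_file_binary(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    if (!file) {
        throw std::runtime_error("cannot open " + path);
    }
    // Copy every byte from the stream buffer into the string.
    return std::string(std::istreambuf_iterator<char>(file),
                       std::istreambuf_iterator<char>());
}
```

The returned string still contains any newline characters, so split it afterwards if individual lines are needed.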

Thomas Matthews
  • Thanks for your answer. You described two well-known methods of reading data into a std::string. This post is about reading a large number of symbols, more than 4096. Please try debugging both of your methods and you will see that if you paste a 50000-character string into the command line, only 4095 characters will be stored in s and temp. – Eduard Bagrov Jun 08 '14 at 17:45
  • Suggesting the use of #define to create a "constant" is disgusting. What's the problem with using a constant instead of a macro? – Ulrich Eckhardt Jun 08 '14 at 18:19
  • Sorry, but it's a language compatibility technique. The C language does not allow constant integer variables to be used as array capacities. I'm used to writing code that compiles as both C and C++. – Thomas Matthews Jun 08 '14 at 18:23