I am trying to read a PDF document using iTextSharp. The document is read, but I notice that the name in the extracted text is abbreviated. For example, if the name is "procurement Define document", it comes out as "Proc def doc". I am not sure what I am doing wrong, but I don't want the names to be shortened.
Below is my code:
Imports System
Imports System.Collections.Generic
Imports System.Text
Imports iTextSharp.text
Imports iTextSharp.text.pdf

Public Class _Default
    Inherits System.Web.UI.Page

    Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        Dim oReader As New iTextSharp.text.pdf.PdfReader("C:\4012014.pdf")
        Dim sOut As StringBuilder = New StringBuilder()

        ' Extract the text of each page (page numbers are 1-based)
        For i = 1 To oReader.NumberOfPages
            Dim its As New iTextSharp.text.pdf.parser.SimpleTextExtractionStrategy
            Dim strLineText As String = iTextSharp.text.pdf.parser.PdfTextExtractor.GetTextFromPage(oReader, i, its)
            ' Round-trip the extracted text through the system default encoding into UTF-8
            strLineText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(strLineText)))
            sOut.Append(strLineText)
        Next

        oReader.Close()
        sOut.Append("<br/>")
        txtTest1.Text = sOut.ToString()
    End Sub
End Class