3

I am trying to extract text from a PDF file using the MuPDF library on Android.

Is it possible to extract text within a rectangle specified by coordinates (left, top, right, bottom)?

Note: I didn't compile the library from source. I am using the precompiled libraries distributed at https://github.com/libreliodev/android.

Halil
  • Is there an answer to this question? Did you find a method to get text by specifying the left, top, right and bottom coordinates? – Naresh Jul 28 '15 at 12:38

2 Answers

1

Yes, here is one way to do it.

1. GeneratedText activity

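import android.app.Activity;
import android.content.Intent;
import android.content.SharedPreferences;
import android.os.Bundle;
import android.view.View;
import android.view.View.OnClickListener;
import android.widget.Button;
import android.widget.EditText;
import android.widget.TextView;

import java.util.regex.Pattern;

public class GeneratedText extends Activity {

private Button close;
private Button clear;
private Button undo;
private TextView tv;
private EditText edit;
private String data;
Intent i;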


@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_generated_text);

    close = (Button)findViewById(R.id.close);
    clear = (Button)findViewById(R.id.clear);
    tv = (TextView)findViewById(R.id.text1);
    edit = (EditText)findViewById(R.id.edit);
    undo = (Button)findViewById(R.id.undo);
    undo.setEnabled(false);

    i = getIntent();

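    data = i.getStringExtra("data");
    if (data == null) {
        data = ""; // no text was passed in the Intent
    }

    tv.setText(data);
    edit.setText(data);

    // Check whether the extracted text contains the header line.
    // String.matches() only succeeds when the whole string matches, so use find().
    String mypattern = "Name and address of the Employee \n";
    Pattern p = Pattern.compile(Pattern.quote(mypattern));
    if (p.matcher(data).find()) {
        System.out.println("Start Printing name");
    }

    System.out.println("hello user " + "\n" + "user1" + "\n" + "user2");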

    SharedPreferences pref = getSharedPreferences("key", 0);
    SharedPreferences.Editor editor = pref.edit();
    editor.putString("text", data);
    editor.commit();


    clear.setOnClickListener(new OnClickListener() {

        @Override
        public void onClick(View v) {
            // TODO Auto-generated method stub
            tv.setText("");
            edit.setText("");
            undo.setEnabled(true);
        }
    });
    close.setOnClickListener(new OnClickListener() {

        @Override
        public void onClick(View v) {
            // TODO Auto-generated method stub
            finish();
        }
    });
    undo.setOnClickListener(new OnClickListener() {

        @Override
        public void onClick(View v) {
            // TODO Auto-generated method stub
             String value = "";
            SharedPreferences pref = getSharedPreferences("key", 0);
            value = pref.getString("text", value);
            edit.setText(value);
            tv.setText(value);
            undo.setEnabled(false); 
        }
    });

}
}

2. Now add this method to MuPDFActivity:

public void Showtext()
{
    destroyAlertWaiter();
    core.stopAlerts();

    // Read the selected text from a page view and pass it to GeneratedText.
    MuPDFPageView pdfview = new MuPDFPageView(MuPDFActivity.this, core, null);
    pdfview.setFocusable(true);
    String data = pdfview.getSelectedText();

    Intent i = new Intent(getApplicationContext(), GeneratedText.class);
    i.putExtra("data", data);
    startActivity(i);
}

Call Showtext() from OnAcceptButtonClick and you will get your text.

undur_gongor
Kumar Bankesh
    This example extracts selected text from a pdfView and displays it in GeneratedText activity, right? – Halil Aug 22 '14 at 09:25
0

Yes, it is possible to extract text from a PDF document with the MuPDF library. MuPDFCore.java declares a method called text(), implemented in mupdf.c, which returns the text of the page. You need to call that method page by page.

Steps:
1. gotopage(pagenumber)
2. text()
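
The steps above return the text of a whole page. To restrict the result to a rectangle (left, top, right, bottom), one option is to keep only the words whose bounding boxes overlap it. Here is a minimal sketch, assuming the page text can be obtained as lines of words that each carry a bounding box. The Word and RectTextExtractor classes and the sample coordinates are hypothetical stand-ins for illustration, not MuPDF's API.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for one word and its bounding box in page coordinates. */
final class Word {
    final float left, top, right, bottom;
    final String text;

    Word(float left, float top, float right, float bottom, String text) {
        this.left = left;
        this.top = top;
        this.right = right;
        this.bottom = bottom;
        this.text = text;
    }

    /** True if this word's box overlaps the query rectangle (y grows downward). */
    boolean intersects(float l, float t, float r, float b) {
        return left < r && l < right && top < b && t < bottom;
    }
}

final class RectTextExtractor {

    /** Joins overlapping words: spaces within a line, newlines between lines. */
    static String extract(List<List<Word>> lines, float l, float t, float r, float b) {
        StringBuilder out = new StringBuilder();
        for (List<Word> line : lines) {
            StringBuilder lineText = new StringBuilder();
            for (Word w : line) {
                if (w.intersects(l, t, r, b)) {
                    if (lineText.length() > 0) {
                        lineText.append(' ');
                    }
                    lineText.append(w.text);
                }
            }
            if (lineText.length() > 0) {
                if (out.length() > 0) {
                    out.append('\n');
                }
                out.append(lineText);
            }
        }
        return out.toString();
    }

    /** Made-up sample data: two lines of words with invented coordinates. */
    static List<List<Word>> samplePage() {
        return Arrays.asList(
            Arrays.asList(
                new Word(10, 10, 50, 20, "Name"),
                new Word(55, 10, 80, 20, "and"),
                new Word(85, 10, 140, 20, "address")),
            Arrays.asList(
                new Word(10, 30, 45, 40, "John"),
                new Word(50, 30, 95, 40, "Smith")));
    }

    public static void main(String[] args) {
        // Only the second line lies inside this rectangle.
        System.out.println(extract(samplePage(), 0, 25, 100, 45));
    }
}
```

This sketch keeps any word that overlaps the rectangle at all. If only fully enclosed words are wanted, the test in intersects() can be replaced with a containment check.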

Ganesh Kanna