Search tool for pdf file contents

Melroy
Posts: 22
Joined: 2015-06-10 12:59

Post by Melroy » 2016-03-07 21:33

Hi,
I am using AppGini version 5.5.

My website database is for a History Society Archive Project shared by a number of archivists who digitise and record photos and documents (PDF). I would like to allow certain parts of the database to be viewed by our members, and it would be very useful if they could search the image file, which stores both images and PDF files, for the contents of files that have readable text. (All scanned documents have been through an OCR program before being saved as PDF.) A search would return only those records containing the requested term, either displayed as at present or on a separate page with links to the files. I currently use FreeFind to do this on an HTML website, but it does not work for a dynamic website. Is there a simple solution?

Melroy
Posts: 22
Joined: 2015-06-10 12:59

Re: Search tool for pdf file contents

Post by Melroy » 2016-03-10 10:28

The new search plugin is a great addition, but it does not solve the problem of searching for text within a file (e.g. pdf, docx). I may get the plugin anyway, but I would like to know whether what I am asking is possible and whether AppGini intends to incorporate this feature in the future.

eagle
Veteran Member
Posts: 39
Joined: 2013-01-09 15:38

Re: Search tool for pdf file contents

Post by eagle » 2016-03-13 10:31

Perhaps you could solve it by adding this library:

http://www.pdfparser.org/

You could create a couple of new fields for each file, for example date_indexed and fulltext_extracted.

Then you could create a nightly PHP script that goes through all files in your database where date_indexed is NULL, stores the extracted text in fulltext_extracted, and sets date_indexed for the newly indexed files. A search on fulltext_extracted would then find the records whose files contain the term.
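
A minimal sketch of that nightly job, written in Python rather than PHP so it runs standalone, with SQLite standing in for the AppGini MySQL database. The table name `documents`, the `file` column, and the `extract_text` callable are assumptions for illustration only; in a real script the extractor would wrap whichever PDF text library you choose.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Callable, List, Tuple


def index_pending_files(conn: sqlite3.Connection,
                        extract_text: Callable[[str], str]) -> int:
    """Extract text for every row not yet indexed; return how many succeeded."""
    # NULL has to be tested with IS NULL; "= NULL" never matches in SQL.
    rows = conn.execute(
        "SELECT id, file FROM documents WHERE date_indexed IS NULL"
    ).fetchall()

    indexed = 0
    for row_id, path in rows:
        try:
            text = extract_text(path)  # hypothetical extractor supplied by the caller
        except Exception:
            # Leave date_indexed NULL so the file is retried on the next run.
            continue
        conn.execute(
            "UPDATE documents SET fulltext_extracted = ?, date_indexed = ? "
            "WHERE id = ?",
            (text, datetime.now(timezone.utc).isoformat(), row_id),
        )
        indexed += 1

    conn.commit()
    return indexed


def search_documents(conn: sqlite3.Connection, term: str) -> List[Tuple[int, str]]:
    """Return (id, file) for rows whose extracted text contains the term."""
    # Escape LIKE wildcards so a search for "100%" is treated literally.
    escaped = (term.replace("\\", "\\\\")
                   .replace("%", "\\%")
                   .replace("_", "\\_"))
    return conn.execute(
        "SELECT id, file FROM documents "
        "WHERE fulltext_extracted LIKE ? ESCAPE '\\' ORDER BY id",
        (f"%{escaped}%",),
    ).fetchall()
```

Files that fail to parse keep date_indexed as NULL, so the next nightly run picks them up again. For a large archive, a LIKE search over long text may become slow, and a full-text index on fulltext_extracted would probably be worth a look.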

Melroy
Posts: 22
Joined: 2015-06-10 12:59

Re: Search tool for pdf file contents

Post by Melroy » 2016-03-13 15:22

Hi Eagle,
Thanks for your reply to my query. Since posting, I have found and purchased ($8) a piece of code called 'PHP Search Engine' from CodeCanyon, which I hope will do the trick. I will also look at your suggestion, which sounds good. I have very little coding experience, so it might take a while to suss it all out. Thanks again.
