I'm happy to announce that boccob, the #greytalk bot, can now search the text of Dragon Magazine and Strategic Review.
Every issue will be searched and you will receive your results via private message.
This project isn't complete yet, so there may be bugs. Let me know here if you spot a problem or have a suggestion for the future.
Use:
!bsearch <search terms>
- quotes and other punctuation are not needed, but if a search turns up nothing, consider retrying with an appropriate period or comma (I hope to improve this over time)
- wildcards, regular expressions, and boolean operators are not currently supported
!recent <number>
- A peek at up to 100 (the default is 10) of the most recent searches performed
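For anyone curious how the two commands fit together, here is a minimal sketch of parsing them. This is not boccob's actual code: the function name `parse_command` and the error messages are hypothetical, and applying the 4-character minimum to the whole query (rather than to each word) is an assumption.

```python
DEFAULT_RECENT = 10   # default stated in the post
MAX_RECENT = 100      # upper limit stated in the post
MIN_QUERY_LENGTH = 4  # minimum from the notes; applying it to the whole query is an assumption


def parse_command(message):
    """Return ("bsearch", terms), ("recent", count), or ("error", reason)."""
    parts = message.strip().split(None, 1)
    if not parts:
        return ("error", "empty message")
    command = parts[0].lower()
    argument = parts[1].strip() if len(parts) > 1 else ""

    if command == "!bsearch":
        if len(argument) < MIN_QUERY_LENGTH:
            return ("error", "search terms must be at least 4 characters long")
        return ("bsearch", argument)

    if command == "!recent":
        if not argument:
            return ("recent", DEFAULT_RECENT)
        try:
            count = int(argument)
        except ValueError:
            return ("error", "!recent expects a number")
        # Clamp the request into the 1..100 range.
        return ("recent", max(1, min(count, MAX_RECENT)))

    return ("error", "unknown command")
```

With this sketch, `parse_command("!recent")` falls back to the default of 10, and a request for more than 100 recent searches is capped at 100.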
Notes:
- This command provides NO access to the underlying material; only the issue and page numbers where the search terms are found are returned
- Due to limitations and/or corruption in my available data sources:
  o Some issues may never be available.
    * Currently the following issues are not indexed: 147-149
  o Some issues will have ads and other non-article text exposed.
  o I've spent many, many hours verifying the accuracy of the text data, but I have not undertaken, and never will undertake, a truly exhaustive review. Inaccuracies and mangled text may exist, which may affect the accuracy of your searches.
  o Columnar text has not always been extracted correctly; this may cause false positives or negatives when searching for specific phrases in SOME issues.
- Due to inconsistent standards, there are likely to be off-by-one errors where the page numbers returned for a given issue are incorrect relative to the true page count*
  * Dragon Magazine has, at times, used some really insane page numbering schemes if one goes by the page number shown on the physical page. I can only index by counting from the first physical page and incrementing with each new page. I can't do a thing about the weird skips Dragon became fond of over time.
- Search terms shorter than 4 characters are disallowed - this number may increase depending on use
- In the face of abuse, financial considerations, or entropy, this tool may disappear at any time without warning; if you value it, let me know.
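To illustrate the page-numbering note above, here is a rough reader-side sketch of relating a physical page count to the number printed on the page. It is not part of boccob: both function names are hypothetical, the count of unnumbered leading pages is something you would have to look up per issue, and the mapping assumes the printed numbering has no skips, which the note says is not always true for Dragon.

```python
def physical_to_printed(physical_page, unnumbered_leading_pages):
    """Map a physical page count (1 = first physical page) to a printed page number.

    Returns None when the page falls inside the unnumbered front matter.
    Assumes printed numbering runs without skips after the front matter.
    """
    printed = physical_page - unnumbered_leading_pages
    return printed if printed >= 1 else None


def candidate_pages(reported_page, tolerance=1):
    """Pages worth checking around a reported hit, allowing for off-by-one errors."""
    start = reported_page - tolerance
    stop = reported_page + tolerance + 1
    return [page for page in range(start, stop) if page >= 1]
```

For example, if an issue had two unnumbered pages before printed page 1 (a made-up figure), physical page 37 would correspond to printed page 35, and `candidate_pages(35)` suggests also glancing at pages 34 and 36.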
Request for Assistance:
If you discover a search did not return a result you know for certain should have been returned (i.e., you can find the search term in a given issue/page number, and didn't receive it in the results from boccob), please reply here with details so I can correct the text.
As an example: During a test run I searched for "Tasha's Hideous Laughter" in Dragon #338, only to receive no results. However, I know that the phrase occurs on page 35 (according to the issue's page numbering).
All I need from anyone willing to provide help here is a message such as:
"Tasha's Hideous Laughter", Dragon 338, page 35
Thanks!
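If reports arrive in the format shown above, they are easy to process mechanically. The following is only a sketch under that assumption: `REPORT_PATTERN` and `parse_report` are hypothetical names, the pattern handles Dragon reports only, and it is not how boccob or its maintainer actually handles corrections.

```python
import re

# Matches lines such as: "Tasha's Hideous Laughter", Dragon 338, page 35
REPORT_PATTERN = re.compile(
    r'^\s*"(?P<phrase>.+)"\s*,\s*Dragon\s+#?(?P<issue>\d+)\s*,\s*page\s+(?P<page>\d+)\s*$',
    re.IGNORECASE,
)


def parse_report(line):
    """Return a dict with phrase, issue, and page, or None if the line does not match."""
    match = REPORT_PATTERN.match(line)
    if match is None:
        return None
    return {
        "phrase": match.group("phrase"),
        "issue": int(match.group("issue")),
        "page": int(match.group("page")),
    }
```

A line that does not follow the format simply yields `None`, so malformed reports can be set aside for manual reading.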
TODO:
- Find better copies of issues 147-149, which currently resist every attempt to repair them
- Allow users to restrict the search space
- Support additional search options
- Maybe add additional data sources... open to suggestions (RPGA? something else?)
- Seek cleaner sources or clean stuff up myself
This will be a very helpful tool to use, lamashtu. Thank you!
Is this only for use on GreyTalk?
The problem you experienced searching for "Tasha's Hideous Laugher" may be explained by the fact that you left out the 't' in 'laughter' both times you typed it.
Sadly though, there was indeed an OCR error during that test.
This is indeed for greytalk only - and I doubt I'll ever pursue another home for it. Actual participation in chat is not required, of course - all are welcome to pop in, run some searches, and disappear again.
Also, for those who were present for my earlier testing: searches no longer take 8-10 minutes. Instead, they currently average around 90 milliseconds.
lamashtu: Are you familiar with the Dragondex @ http://www.aeolia.net/dragondex/ and if so, are you able to import it as the baseline text index for your boccob search?
grodog: Yes, and yes - but I'm not sure there would be any benefit in trying to tie the two together.
I could provide relevant hits from Dragondex before/after the other results. Or I could create a separate command allowing users to query the Dragondex alone.
My inspiration to take on this project was a recent question where someone was looking for an article in Dragon, and had a number of key words they associated with it. I thought people might find it useful to have a means to try to discover or re-discover such things.
I'm certainly open to ideas that would make the tool more useful. Can you explain what you're envisioning?
All issues of Dragon except 147, 148, and 149 are now indexed.
If I ever manage to get copies of those last three issues that I can work with, I will add them. Otherwise, as there's an overall lack of interest, development has ceased.
Canonfire! is a production of the Thursday Group in association with GREYtalk and Canonfire! Enterprises