Bug reports on any software

When a bug tracker sits inside the exclusive walled gardens of MS GitHub or GitLab.com, and you cannot or will not enter, where do you file your bug report? Here, of course. This is a refuge where you can report bugs that are otherwise unreportable due to technical or ethical constraints.

⚠ Of course, there are no guarantees that a report will be seen by anyone relevant. Hopefully some kind souls will volunteer to proxy the reports.

founded 3 years ago

grep/pdfgrep’s inability to match across lines (sopuli.xyz)

submitted 5 months ago* (last edited 5 months ago) by [email protected] to c/[email protected]

Some will regard this as an enhancement request. To each his own, but IMO *grep has always had a huge deficiency when processing natural language because of line breaks. pdfgrep suffers especially, because most PDF documents carry a payload of natural language.

If I need to search for “the.orange.menace” (dots are 1-char wildcards), of course I want to be told of cases like this:

A court whereby no one is above the law found the orange  
menace guilty on 34 counts of fraud.
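
To illustrate the behaviour being asked for, here is a minimal Python sketch (not how grep or pdfgrep work internally, and not a proposed patch). It assumes it is acceptable to collapse every run of whitespace, newlines included, into a single space before matching; the helper name `find_across_lines` is made up for this example.

```python
import re


def find_across_lines(pattern: str, text: str) -> list[str]:
    """Match `pattern` against `text` with line breaks treated as plain spaces."""
    # Collapse any run of whitespace (spaces, tabs, newlines) to one space,
    # so a phrase wrapped across two lines becomes contiguous again.
    flattened = re.sub(r"\s+", " ", text)
    return [m.group(0) for m in re.finditer(pattern, flattened)]


sample = (
    "A court whereby no one is above the law found the orange  \n"
    "menace guilty on 34 counts of fraud."
)

# A line-by-line search would miss this; the flattened search finds it.
print(find_across_lines(r"the.orange.menace", sample))
```

The obvious cost is that the match no longer maps to a single line number, which is probably part of why line-oriented tools do not do this by default.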

When processing natural language, a sentence terminator is almost always a more sensible boundary than a newline. grep is among the oldest commands still in everyday use, so it’s bizarre that it has not evolved much. In the 90s there was a LexisNexis search tool that was far superior for natural-language queries, e.g. (IIRC):

foo w/s bar :: matches if “foo” appears within the same sentence as “bar”
foo w/4 bar :: matches if “foo” appears within four words of “bar”
foo pre/5 bar :: matches if “foo” appears before “bar”, within five words
foo w/p bar :: matches if “foo” appears within the same paragraph as “bar”
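
The connectors above could be approximated roughly as follows. This is a hedged Python sketch of the semantics as described in this post, not of the actual LexisNexis implementation. All function names are hypothetical, sentences are split naively on `.`, `!` or `?` followed by whitespace, paragraphs on blank lines, and "within n words" is assumed to mean the word positions differ by at most n.

```python
import re


def words(text: str) -> list[str]:
    """Lowercased word tokens; punctuation is dropped."""
    return re.findall(r"\w+", text.lower())


def within_words(text: str, a: str, b: str, n: int, ordered: bool = False) -> bool:
    """Rough `a w/n b`; with ordered=True, rough `a pre/n b` (a must come first)."""
    toks = words(text)
    pos_a = [i for i, t in enumerate(toks) if t == a.lower()]
    pos_b = [i for i, t in enumerate(toks) if t == b.lower()]
    for i in pos_a:
        for j in pos_b:
            distance = j - i
            if ordered and 0 < distance <= n:
                return True
            if not ordered and 0 < abs(distance) <= n:
                return True
    return False


def _same_unit(text: str, a: str, b: str, splitter: str) -> bool:
    """True if both words occur in the same chunk produced by `splitter`."""
    for unit in re.split(splitter, text):
        toks = words(unit)
        if a.lower() in toks and b.lower() in toks:
            return True
    return False


def within_sentence(text: str, a: str, b: str) -> bool:
    """Rough `a w/s b`: naive split after sentence-ending punctuation."""
    return _same_unit(text, a, b, r"(?<=[.!?])\s+")


def within_paragraph(text: str, a: str, b: str) -> bool:
    """Rough `a w/p b`: paragraphs are separated by blank lines."""
    return _same_unit(text, a, b, r"\n\s*\n")
```

Note that in this sketch the word-distance check ignores sentence and paragraph boundaries, and the naive sentence splitter will trip over abbreviations such as “e.g.”.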

Newlines as record separators are probably sensible for everything other than natural language. But for natural language, grep is a hack.
