
Neat. Many questions. I assume you have some database of path matches for the theme/plugin JS/CSS files or similar?

How extensive is the database? Just free themes/plugins from one source, or popular themes from several sources? Paid ones too?

Any namespace clashes where you have to dig deeper to tell which theme or plugin it is?

Were you able to fully automate the creation and updating of the signature database?




We actually don't have a signature database; it isn't needed.

WordPress has a predictable path structure, and we use that to extract theme and plugin slugs (textual IDs). For plugins that don't load any JS or stylesheets, we look for other signatures.
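To make the path idea concrete, here is a rough sketch, not our actual code. It assumes assets live under the conventional /wp-content/plugins/<slug>/ and /wp-content/themes/<slug>/ directories and that slugs contain only letters, digits, hyphens and underscores. The names ASSET_PATH and extract_slugs are made up for illustration.

```python
import re

# Assumption: the conventional WordPress asset layout,
# /wp-content/plugins/<slug>/... and /wp-content/themes/<slug>/...
ASSET_PATH = re.compile(r"/wp-content/(plugins|themes)/([A-Za-z0-9_-]+)/")


def extract_slugs(urls):
    """Collect plugin and theme slugs from an iterable of asset URLs."""
    found = {"plugins": set(), "themes": set()}
    for url in urls:
        # findall yields (kind, slug) tuples, one per match in the URL
        for kind, slug in ASSET_PATH.findall(url):
            found[kind].add(slug)
    return found
```

Sites that rename or relocate wp-content would not match this pattern, which is one reason other signatures are still needed.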

Once we have the slugs, we do a lookup in the official WordPress theme/plugin repository and get all the info we need (plugin description, icon, author, etc.).
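A lookup like that could be sketched roughly as follows. This is an illustration only: it assumes the public WordPress.org info API at api.wordpress.org with the plugin_information and theme_information actions, and the helper names build_info_url and fetch_info are hypothetical. Check the endpoint and response fields against the WordPress.org API docs before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: endpoint layout of the public WordPress.org info API.
API_BASE = "https://api.wordpress.org/{kind}/info/1.2/"
ACTIONS = {"plugins": "plugin_information", "themes": "theme_information"}


def build_info_url(kind, slug):
    """Build the lookup URL for a slug; kind is 'plugins' or 'themes'."""
    query = urlencode({"action": ACTIONS[kind], "request[slug]": slug})
    return API_BASE.format(kind=kind) + "?" + query


def fetch_info(kind, slug, timeout=10):
    """Fetch and decode the JSON metadata for a slug (makes a network call)."""
    with urlopen(build_info_url(kind, slug), timeout=timeout) as resp:
        return json.load(resp)
```

Slugs that are not in the official repository (paid or custom themes and plugins) would need separate handling, since a lookup for them cannot succeed.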


Looks like it might be using some pattern matching that could be tightened up a bit.

For example:

http://wpdetective.io/wordpress.org

That trips it up a bit with false positives, finding plugins that aren't plugins.


When pattern matching goes wrong... Good find, we will fix this, thanks! :)


It has the look of using regex against the raw page HTML.

I would guess you'd have better luck parsing the HTML and extracting the href attributes of any <link> tags, the src attributes of any <script> tags, etc., then pattern matching only against those.
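Something like this is what I have in mind, as a minimal sketch using Python's standard html.parser. The class and function names are mine, and a real crawler would probably also want img, style and inline references.

```python
from html.parser import HTMLParser


class AssetURLCollector(HTMLParser):
    """Collect href values of <link> tags and src values of <script> tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("href"):
            self.urls.append(attrs["href"])
        elif tag == "script" and attrs.get("src"):
            self.urls.append(attrs["src"])


def asset_urls(html):
    """Return asset URLs found in tag attributes, ignoring body text."""
    collector = AssetURLCollector()
    collector.feed(html)
    return collector.urls
```

Running the path pattern over only the returned list means a page that merely mentions a plugin path in its text or documentation no longer produces a match.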



