Hacker News | jlafon's comments

Many have commented here on the computational challenges of enumerating, storing, and searching very large virtual libraries. While molecules can be represented and stored as strings, that's an oversimplification of the problem (from a CS perspective). Scientists often want to search these large libraries in 2D and 3D, which requires computing and storing those coordinates. It can be cost-prohibitive just to store the 3D coordinates for massive virtual libraries, even for large pharmaceutical companies. We've done this for 10^10 molecules [0]. If you are reading this and find this type of problem interesting, check out www.eyesopen.com/careers

[0] https://pubs.acs.org/doi/10.1021/acs.jcim.9b00779 "Virtual Screening in the Cloud: How Big Is Big Enough?"


Also, a molecule doesn't just have a single set of 3D coordinates. You have conformers, and solvent effects that affect their conformation as well. Often in a biological system it can be a minor conformer shape that is the strongly binding one (as in the case of taxol). If you want to compute physical properties such as NMR, VCD, or ECD, you need to optimize your molecules in a completely different way (quantum mechanical calculations of electron shells) than with the simple mechanical modelling approach used for docking. We are talking about a 1 s calculation per molecule vs. 24 h or more for a mid-sized molecule.
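To put that gap in perspective, here is a back-of-the-envelope sketch. The 1 s and 24 h per-molecule figures come from the comment above; the library size is a hypothetical number picked for illustration, and the totals assume the per-molecule cost is constant.

```python
SECONDS_PER_HOUR = 3600


def total_cpu_hours(n_molecules, seconds_per_molecule):
    """Total compute time in CPU-hours, assuming a constant cost per molecule."""
    return n_molecules * seconds_per_molecule / SECONDS_PER_HOUR


# Per-molecule costs quoted in the comment.
mechanical_seconds = 1                      # simple mechanical model for docking
quantum_seconds = 24 * SECONDS_PER_HOUR     # quantum mechanical optimization

# How many times slower the quantum mechanical route is per molecule.
slowdown = quantum_seconds // mechanical_seconds

# Hypothetical library size, chosen only for illustration.
library_size = 1_000_000

mechanical_hours = total_cpu_hours(library_size, mechanical_seconds)
quantum_hours = total_cpu_hours(library_size, quantum_seconds)

print(f"slowdown per molecule: {slowdown}x")
print(f"mechanical: {mechanical_hours:,.0f} CPU-hours")
print(f"quantum:    {quantum_hours:,.0f} CPU-hours")
```

Even for a library far smaller than 10^10 molecules, the quantum mechanical route is only practical for a small, pre-filtered subset.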


OpenEye Scientific | Santa Fe, NM | Onsite/Fulltime | http://www.eyesopen.com/careers

OpenEye Scientific Software provides software to the Pharmaceutical Industry for molecular modeling and cheminformatics. It has done so since 1997 in its continuing mission to provide novel software, new science and better business practices to the industry. Central to our approach is the importance of shape and electrostatics as primary variables of molecular description, platform-independent code for high-throughput 2D and 3D modeling, and a preference for the rigorous rather than the ad hoc.

We have three open positions in Santa Fe, NM. Some of the technologies we use are Python, Go, C++, Django, Docker, and AWS. We are growing our small team and looking for talented backend engineers and DevOps engineers.

Santa Fe is a beautiful mountain town with excellent outdoor recreation (hiking, skiing, cycling) as well as art, music & food culture.


First of all, I'm not defending what is obviously predatory. However, there is more involved than what you might think at first glance. Right or wrong, correctional facilities have reasons to discourage phone calls (context: I put myself through college working at a maximum security prison). Calls are supposed to be monitored (usually done manually) to prevent criminal business from being done on prison phones - and there are never enough people to listen to all calls. There are never enough phones either, which frequently causes tension between inmates using phones and those waiting for them. In higher security levels phones are labor intensive. An officer has to escort a (potentially dangerous) person from their cell to the phone, and stand there for the duration of the call. And to the article's point, it's such a problem that prepaid phone cards are a form of currency on the inside.


> Calls are supposed to be monitored (usually done manually) to prevent criminal business from being done on prison phones - and there are never enough people to listen to all calls.

At $1/minute sorts of rates, it should be possible to pay several people to listen to a call.


Calls are supposed to be monitored (usually done manually) to prevent criminal business [...] In higher security levels phones are labor intensive. An officer has to escort [...] from their cell to the phone, and stand there for the duration of the call.

Sounds like part of the solution is right there, at least in higher security prisons, which is to have the escort also monitor the call. Record all calls regardless and perform spot-audits so that any potential collusion between inmates, outsiders and escorts can be eliminated or minimized. A higher percentage of spot-audits should be carried out on repeat offenders and a lower percentage on inmates associated with minor crimes - risk can be assessed with a couple of conditional equations. Advise inmates and staff of the monitoring system to prevent collusion from forming in the first place.
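A rough sketch of what those "couple of conditional equations" might look like. Every threshold and percentage here is made up for illustration, and the field names (prior_offenses, offense_severity) are hypothetical; a real scheme would need actual risk data.

```python
import random


def audit_percent(prior_offenses, offense_severity):
    """Return the percentage of this inmate's recorded calls to spot-audit.

    All numbers are hypothetical placeholders, not a validated risk model.
    """
    percent = 5  # baseline: everyone gets some audits
    if prior_offenses >= 1:
        percent += 25  # repeat offenders are audited more often
    if offense_severity == "serious":
        percent += 30  # minor offenses stay near the baseline
    return min(percent, 100)


def select_for_audit(calls, rng):
    """Pick recorded calls for review, weighted by each caller's audit percentage."""
    selected = []
    for call in calls:
        percent = audit_percent(call["prior_offenses"], call["offense_severity"])
        if rng.random() * 100 < percent:
            selected.append(call["call_id"])
    return selected


calls = [
    {"call_id": 1, "prior_offenses": 0, "offense_severity": "minor"},
    {"call_id": 2, "prior_offenses": 3, "offense_severity": "serious"},
]
print(select_for_audit(calls, random.Random(0)))
```

Because selection is random per call, nobody (inmate or escort) can know in advance which recordings will be reviewed.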


They record the phone calls. It seems that if there was an incident it is trivial to review phone calls and bring evidence to bear against the guilty parties.


Or someone like Google could be doing this as a service? Isn't GOOG411 how they captured enough voice data to train their neural nets for Google Now?


This is because the people running the show have created a problem. Let business be done. Let all prisoners know all phone calls are recorded and may be monitored at any time. And record all of them. Then get a company that does this sort of thing to transcribe them all. Then do a search on all the text for keywords that might indicate problems or send the text to India to be read. That's it. The rest, well you know who they are calling, those are potential people to investigate... send those to the NSA or the police.

Or just let illegal business be done. What's the worst that could happen. Lots of criminals outside of bars too.

Also you could let the non-violent ones do whatever and only watch the violent ones. It just depends on whether or not you have a problem-solving attitude or a problem-creating one. It seems the prison industrial complex creates problems and then spends lots of money to solve them, money that comes from prisoners' families, who are already broke and belong to low-income households.


>Or just let illegal business be done. What's the worst that could happen. Lots of criminals outside of bars too.

What's the worst that could happen? How about this: inmate arranges a contract on witnesses in his case.


Monitor calls: How can this be an issue when we have the NSA able to snoop on all the world's phone calls and mine them for content?

Apple Siri has the ability to translate spoken word to text.

The point is that there is lots of technology that can do a first pass filter on phone calls.

It doesn't need to be manual.
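As a minimal sketch of such a first-pass filter, assuming the calls have already been transcribed to text by some speech-to-text system: the watchlist words and the function names below are hypothetical, and plain keyword matching is only the crudest possible filter.

```python
import re

# Hypothetical watchlist; a real one would come from investigators.
WATCHLIST = {"witness", "package", "payment"}


def flag_transcript(transcript, watchlist=WATCHLIST):
    """Return the sorted watchlist words that appear in a call transcript."""
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    return sorted(words & watchlist)


def first_pass(transcripts):
    """Map call id -> matched words, keeping only calls with at least one hit.

    Calls returned here would go to a human reviewer; the rest are skipped.
    """
    flagged = {}
    for call_id, text in transcripts.items():
        hits = flag_transcript(text)
        if hits:
            flagged[call_id] = hits
    return flagged


transcripts = {
    "call-1": "How are the kids doing in school?",
    "call-2": "Tell him the Witness moved, and bring the payment.",
}
print(first_pass(transcripts))
```

The point is only that humans would review the flagged subset instead of listening to every call.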

Additionally, this is ignoring the reality that cell phones are smuggled into prisons as well. Any coordinated criminal activity doesn't need to use a land line that is monitored - just use a smuggled cell phone. ( http://fusion.net/story/41931/inside-the-prison-systems-illi... )


> The point is that there is lots of technology that can do a first pass filter on phone calls.

Like most things, it becomes a question of incentives not technology. To the prison it looks like:

You want me to pay lots of money to develop/purchase a system that will vastly reduce the overhead on my very expensive phone system and will probably make me have to charge less?

(And for private prisons)

This will reduce recidivism and therefore the number of repeat 'customers' coming to my facility?


Of all the ways of limiting phone use, is this a good method?

Why not just limit duration?


If you're going to permit phone calls at all, you need to allow enough duration to enable the reasons for allowing phone calls, and that's plenty enough time to orchestrate criminal activity.


Is there any evidence that criminals conduct more criminal activity during their prison sentence than the general populace? If there isn't any, then either you should be monitoring all citizen calls, or get rid of this arbitrary punishment for criminals. If there is such evidence, is the difference substantial enough to warrant the surveillance?


>Is there any evidence that criminals conduct more criminal activity during their prison sentence than the general populace?

The point is rather moot, considering that one of the main purposes of incarceration is incapacitation (effect of a sentence in positively preventing, rather than merely deterring, future offending).

I.e. you cannot steal, rob or kill because being monitored in prison prevents you from doing so. Lack of evidence therefore wouldn't make much of a justification for removing such monitoring because it could be argued just to show that the monitoring works.

However, without digging into statistics I would expect that there is more crime in prisons than outside (not because incapacitation wouldn't work at all, but because offenders are concentrated in prison).

There is substantial violence in prisons between inmates. (Not only in the US but everywhere).


OpenEye Scientific Software (Santa Fe, New Mexico) - various positions: http://www.eyesopen.com/careers


Nice work. I implemented something very similar last year: https://github.com/jlafon/pynamodb.


Awesome, your project and flywheel[0] were the inspiration for a lot of the design[1]!

How do you handle path (Map and List) conditions? I've got an open issue[2] and an idea but I suspect you'd have concrete examples if you've already run into it.

[0]: https://github.com/mathcamp/flywheel

[1]: and of course SQLAlchemy, which must always be considered

[2]: https://github.com/numberoverzero/bloop/issues/18


It's interesting that Netflix decided to write this rather than using Amazon's own DynamoDB. I wonder specifically if DynamoDB was too expensive (as I have found), or was there some other reason?


This is completely different than DynamoDB. Essentially this is a framework for sharding or horizontally scaling multiple datastores (MySQL, Memcache, Redis, etc).


Why would you use multiple datastores? Isn't it much more difficult to atomically store data this way? Assume a transaction succeeds in store 1, and fails in store 2, you'd need to roll back the transaction in store 1 after it has committed (!)
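To make the concern concrete, here is a minimal sketch of a dual write with a compensating undo. InMemoryStore and write_both are hypothetical stand-ins, not the API of any real datastore or of the project under discussion.

```python
_MISSING = object()


class InMemoryStore:
    """Hypothetical stand-in for a datastore client."""

    def __init__(self, fail_on_put=False):
        self.data = {}
        self.fail_on_put = fail_on_put

    def get(self, key, default=None):
        return self.data.get(key, default)

    def put(self, key, value):
        if self.fail_on_put:
            raise RuntimeError("write failed")
        self.data[key] = value

    def delete(self, key):
        self.data.pop(key, None)


def write_both(store_1, store_2, key, value):
    """Write to both stores; if the second write fails, undo the first.

    The undo is a compensating write, not a real rollback: store 1 has
    already committed, so other readers may have seen the new value.
    """
    previous = store_1.get(key, _MISSING)
    store_1.put(key, value)
    try:
        store_2.put(key, value)
    except Exception:
        if previous is _MISSING:
            store_1.delete(key)
        else:
            store_1.put(key, previous)
        raise


primary = InMemoryStore()
primary.put("user:1", "old")
flaky = InMemoryStore(fail_on_put=True)
try:
    write_both(primary, flaky, "user:1", "new")
except RuntimeError:
    print("second store failed; first store restored to", primary.get("user:1"))
```

Note that the compensating write can itself fail, which is exactly why atomicity across independent stores is hard.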


Because there is no such thing as a one-size-fits-all datastore. Some data may make way more sense stored in a relational database, while some data makes more sense as key:value pairs. Some data may be infrequently accessed and stored to disk while others are stored in memory for quick access. Trying to make a single datastore work for many use cases will cause more pain than it is worth.


You wouldn't likely use multiple data stores for the same data. That is, you probably wouldn't have a single system which stores data in both Redis and MySQL.

However, you can imagine different use cases for each of these stores and you can potentially increase operational efficiency by having a common layer which handles the distribution of data for each store.


That makes sense. I had only seen the Memcache & Redis datastores being used (and both being comparable to DynamoDB).


This is the sort of project you do to get cross-region replication on your existing data stores, many of which look more like PostgreSQL than KV stores.


Keep in mind that they define

DC := AWS region

Rack := AWS AZ

That should tell you that they need multi-region availability, something that DynamoDB will likely never provide.


Author of the paper here. The file operations are distributed strictly without links; otherwise we could make no guarantees that work wouldn't be duplicated, or even that the algorithm would terminate. We were lucky in that the parallel file system itself wasn't POSIX, so we didn't have to make our tools POSIX either.
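To illustrate why links matter, here is a single-process sketch (not the distributed algorithm from the paper) of a tree walk that never follows links. It interprets "links" as symbolic links, which is an assumption; walk_without_links is a hypothetical name.

```python
import os


def walk_without_links(root):
    """Yield regular-file paths under root, never following or yielding symlinks.

    Skipping symlinks means each directory is reached by exactly one path,
    so no work is duplicated and a link cycle cannot cause an endless walk.
    """
    pending = [root]  # explicit work queue instead of recursion
    while pending:
        directory = pending.pop()
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_symlink():
                    continue  # a link could point back up the tree
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    yield entry.path
```

In a distributed version, the items on the pending queue are what would be handed out to workers; with links followed, two workers could end up processing the same subtree through different paths.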


For those interested, here are a couple of differences between this and django-debug-toolbar (an excellent, widely used utility that also provides SQL inspection).

1. django-debug-toolbar can be difficult to enable if you aren't serving from localhost (for security reasons, because your settings are included in the UI). This project looks much simpler to enable, as you just set DEBUG=True in your settings.

2. django-debug-toolbar provides its information through a UI component embedded in your own UIs, but it doesn't help you with profiling HTTP APIs that don't have a UI. In contrast, this project puts some profiling information in the response headers, and SQL logs are sent to a logger. I think this use case is where Django Query Inspector shows a lot of utility.


I don't understand #1. You can have DjDT show up only on a condition, e.g. when someone is a superuser. Isn't it the same as this project?


What I meant is that getting the toolbar to show up when you aren't serving from localhost isn't as trivial as setting DEBUG=True. It requires other settings (https://github.com/django-debug-toolbar/django-debug-toolbar...) which can be annoying to configure.


Checking for a user will break caching though.


"but it doesn't help you with profiling HTTP APIs that don't have a UI"

That's been my only complaint about an otherwise fantastic debug tool.

Anyone have any thoughts on running multiple debug tools like these together? I think mostly you would get comfortable with one and stick with it.


I end up using Django Rest Framework (http://www.django-rest-framework.org/) generated UIs along with the debug toolbar, and then disabling the generated UIs when I'm done profiling.


Also, the browsable api that DRF provides can be used to perform the profiling with django-debug-toolbar.


I'm trying to understand this too. This is only a problem if the OAuth provider (Google, Twitter, etc) does not validate the URL that the client is trying to redirect the user to after the user has authorized the app, correct?


Typically you pre-register a whitelist of redirect URLs with your OAuth provider. For example, you might whitelist example.com/app/* because you control the app and assume that you won't do anything evil. If /app/ includes an open redirect (generally considered to be Severity: Nominal), your application can be made to attack every user who grants it permissions, to the limit of the permissions they entrust your application with.
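A minimal sketch of how the two pieces combine. The https scheme, the /app/go endpoint, its next parameter, and attacker.example are all hypothetical; only the example.com/app/* pattern comes from the comment above.

```python
from urllib.parse import parse_qs, urlsplit

# Registered pattern example.com/app/* (scheme assumed to be https).
ALLOWED_PREFIX = "https://example.com/app/"


def provider_accepts(redirect_uri):
    """Provider-side check: prefix match against the registered pattern."""
    return redirect_uri.startswith(ALLOWED_PREFIX)


def open_redirect_target(url):
    """Where a hypothetical open redirect at /app/go?next=... sends the browser."""
    query = parse_qs(urlsplit(url).query)
    return query.get("next", [None])[0]


# The attacker builds a redirect_uri that passes the whitelist check...
crafted = "https://example.com/app/go?next=https://attacker.example/steal"
print(provider_accepts(crafted))

# ...but the app's own open redirect then forwards the user elsewhere.
print(open_redirect_target(crafted))

# A URI outside the registered prefix is rejected outright.
print(provider_accepts("https://attacker.example/steal"))
```

The provider's validation is working as designed here; the weak point is the redirect inside the whitelisted path, which is why exact-match redirect URIs are safer than wildcard patterns.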


If you did a line-for-line transliteration, did you use any of Go's concurrency features such as goroutines or channels? I ask because I find that when I translate a program from Python to Go, it's beneficial to structure the program differently so that I can use goroutines.


With Twisted (and with our threadpool worker pools), many patterns translated directly into Goroutine usage in a much cleaner way. Where with Python we were using our own helper libraries, Go's stdlib and the language itself were often more than enough.

We didn't end up using channels too much. Deferreds got translated into Futures that we wrote a small library for. Many of our Go-specific utility classes do use channels heavily though.

