Contribution: purpose-mechanism schema
purpose-mechanism schema is a feasible target for scaling up
Purpose and mechanism can be annotated feasibly, even for highly technical papers
We have working classifiers that do a fairly good job (F of roughly .6) of tagging purpose segments / spans in scientific texts
Started with predicting overall soft vector
Then did SpanRel + some other magic.
I'm a bit nervous about these numbers because our test set is still the 50 CSCW papers from @chanSOLVENTMixedInitiative2018. Expanding the evaluation is low-hanging fruit worth exploring further, and we'll need much more data for it
can reach out to Kenneth Huang who has ~11k COVID-19 papers annotated
caveat is that purpose accuracy is... not great - F of about .6
not sure how this intersects with genre stuff though... important open question and threat for our approach. we're making a major assumption that enough of what we need is in abstracts. is that true?
could be nuanced: abstract could be enough for purpose
but other stuff we need to get from
i think there is still some room here for improvement
wondering about integration with sys/Semantic Scholar??
could also get some real-time data on matches? not sure though: analytics on usage rate will be tough to get. the real signal is whether a paper gets added to your library, which Mendeley has
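To make the "F of about .6" figure above easier to interpret, here is a minimal sketch of one way a span-level F score could be computed. It assumes exact-match scoring (a predicted span counts only if its start, end, and label all match a gold span); the notes do not say which matching criterion was actually used, and the `Span` type, the `span_f1` helper, and the toy spans are hypothetical.

```python
from typing import Set, Tuple

# Hypothetical span representation: (start_token, end_token_exclusive, label).
Span = Tuple[int, int, str]


def span_f1(gold: Set[Span], predicted: Set[Span]) -> float:
    """Exact-match span F1 (assumed criterion, not necessarily the one used)."""
    if not gold or not predicted:
        return 0.0
    # A prediction is a true positive only if boundaries and label all match.
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only.
gold = {(0, 4, "purpose"), (10, 15, "mechanism"), (20, 24, "purpose")}
pred = {
    (0, 4, "purpose"),
    (10, 14, "mechanism"),  # boundary off by one token, so no credit
    (20, 24, "purpose"),
    (30, 33, "purpose"),    # spurious span
}

score = span_f1(gold, pred)
print(f"span F1 = {score:.3f}")
```

Under exact matching a near-miss boundary earns no credit, so a score around .6 may understate how often the tagger finds roughly the right segment. A token-overlap variant would be more lenient and is worth comparing before drawing conclusions from the number.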
purpose-mechanism schema is useful for supporting analogical search