Open and Sustainable Innovation Systems (OASIS) Lab working notes


Q: How can we augment scientific creativity with computational analogy?

Tags: key question for P/Computational Analogy

Description

Analogy is defined by relational similarity

purpose-mechanism schema is a feasible target for scaling up

Purpose and mechanism can be feasibly annotated, even for highly technical papers

We have working classifiers that do a fairly good job (F = ~.6ish?) of tagging purpose segments / spans in scientific texts

Started with predicting overall soft vector

Then did bi-LSTM

Then did SpanRel + some other magic.

This is currently our best-performing model (from our unpublished manuscript)
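To make the F numbers above concrete, here is a minimal sketch of a token-level F1 for purpose tagging. It assumes that gold and predicted annotations are per-token label lists and that F is computed at the token level; the notes don't say how our F is actually computed, and the `PURPOSE` / `O` labels and the `token_f1` helper are hypothetical names for illustration.

```python
def token_f1(gold, pred, label="PURPOSE"):
    """Token-level precision, recall, and F1 for a single label.

    Assumption: gold and pred are equal-length lists of per-token labels.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy example (made up, not from our data): 3 gold purpose tokens,
# 2 of them recovered, plus 1 false positive.
gold = ["O", "PURPOSE", "PURPOSE", "PURPOSE", "O", "O"]
pred = ["O", "PURPOSE", "PURPOSE", "O", "PURPOSE", "O"]
precision, recall, f1 = token_f1(gold, pred)
```

With only 50 test papers, a handful of tokens flipping either way moves this number noticeably, which is part of why a larger test set matters.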


I'm a bit nervous about these numbers because we're still using the CSCW 50 papers from @chanSOLVENTMixedInitiative2018 as our test set. This is a low-hanging fruit thing that is worth exploring further, and we'll need much more data for this

can reach out to Kenneth Huang who has ~11k COVID-19 papers annotated

caveat is that purpose accuracy is... not great - F of about .6

@huangCODA19UsingNonExpert2020 also shows later that sys/SciBERT actually does a really good job as well, about .7ish

some nice comments in a submission here:

not sure how this intersects with genre stuff though... important open question and threat for our approach. we're making a major assumption that enough of what we need is in abstracts. is that true?

could be nuanced: abstract could be enough for purpose

but other stuff we need to get from

full-text

citation statements

etc

Tom Hope has also explored using graph-based methods to augment the attention mechanisms

And we've also explored the utility and feasibility of sys/Snorkel-like weak supervision to do a cue-based thing.

IIRC, works better for purpose than mechanism

i think there is still some room here for improvement
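A rough sketch of what the cue-based weak supervision idea could look like, in plain Python rather than the actual sys/Snorkel API. Everything here is an assumption for illustration: the cue phrases, the labeling function names, and the simple majority vote with abstain-on-tie are not what we actually ran.

```python
import re
from collections import Counter

ABSTAIN, PURPOSE, MECHANISM = "ABSTAIN", "PURPOSE", "MECHANISM"


# Hypothetical cue patterns; real cues would need to be tuned on our data.
def lf_purpose_cue(sentence):
    pattern = r"\b(in order to|we aim to|the goal of)\b"
    return PURPOSE if re.search(pattern, sentence, re.I) else ABSTAIN


def lf_infinitive_opening(sentence):
    # Sentences that open with "To <verb> ..." often state a purpose.
    return PURPOSE if re.match(r"\s*to \w+", sentence, re.I) else ABSTAIN


def lf_mechanism_cue(sentence):
    pattern = r"\b(we propose|we use|by using|based on)\b"
    return MECHANISM if re.search(pattern, sentence, re.I) else ABSTAIN


LABELING_FUNCTIONS = [lf_purpose_cue, lf_infinitive_opening, lf_mechanism_cue]


def weak_label(sentence):
    """Majority vote over non-abstaining labeling functions; ties abstain."""
    votes = [lf(sentence) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN
    return counts[0][0]


labels = [
    weak_label("We aim to support analogical search over papers."),
    weak_label("We use a bi-LSTM tagger over token embeddings."),
    weak_label("Results are reported on 50 papers."),
]
```

Purpose cues like "in order to" tend to be more formulaic than mechanism cues, which may be one reason this kind of approach works better for purpose than mechanism.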

wondering about integration with sys/Semantic Scholar??

could also get some real-time data on matches? not sure though - analytics on usage rate will be tough to get - real signal is whether it gets added to your library, which Mendeley has

so maybe integration with sys/Mendeley is the thing to explore.

purpose-mechanism schema is useful for supporting analogical search

Can find matches

And also science papers @chanSOLVENTMixedInitiative2018

New stuff from Hyeonsu, with case study data

Interestingly, even when the purpose span predictions aren't super great, F-wise, we do ok

Saw this with our explorations of sys/Snorkel too
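A minimal sketch of how purpose and mechanism spans could feed an analogical search, under one assumed notion of a match: similar purpose, different mechanism. The bag-of-words cosine, the `mech_weight` penalty, and the toy papers are all hypothetical stand-ins, not the representation or scoring from @chanSOLVENTMixedInitiative2018.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words counts (stand-in for a real embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def analogy_score(query, paper, mech_weight=0.5):
    """Reward purpose similarity, penalize mechanism similarity (assumption)."""
    purpose_sim = cosine(bow(query["purpose"]), bow(paper["purpose"]))
    mechanism_sim = cosine(bow(query["mechanism"]), bow(paper["mechanism"]))
    return purpose_sim - mech_weight * mechanism_sim


def rank(query, papers):
    return sorted(papers, key=lambda p: -analogy_score(query, p))


# Toy data, invented for illustration.
query = {"purpose": "reduce drag on moving surface",
         "mechanism": "polymer coating"}
papers = [
    {"id": "A", "purpose": "reduce drag on ship hull",
     "mechanism": "polymer coating"},
    {"id": "B", "purpose": "reduce drag on swimsuit surface",
     "mechanism": "riblet texture inspired by shark skin"},
    {"id": "C", "purpose": "classify images",
     "mechanism": "convolutional network"},
]
ranked_ids = [p["id"] for p in rank(query, papers)]
```

Because ranking only needs the purpose text to be roughly right, noisy span predictions can still put the right papers near the top, which would be consistent with doing ok even when span-level F is mediocre.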
