<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.9.3">Jekyll</generator><link href="https://acuna.io/feed.xml" rel="self" type="application/atom+xml" /><link href="https://acuna.io/" rel="alternate" type="text/html" /><updated>2024-01-28T21:25:54-07:00</updated><id>https://acuna.io/feed.xml</id><title type="html">Daniel Acuña</title><subtitle>Science of science research</subtitle><author><name>Daniel Acuña</name><email>daniel.acuna@colorado.edu</email></author><entry><title type="html">ORI: Methods and tools for scalable figure reuse detection with statistical certainty reporting</title><link href="https://acuna.io/funding/scalable-figure-reuse-detection/" rel="alternate" type="text/html" title="ORI: Methods and tools for scalable figure reuse detection with statistical certainty reporting" /><published>2018-07-01T00:00:00-06:00</published><updated>2018-07-01T00:00:00-06:00</updated><id>https://acuna.io/funding/scalable-figure-reuse-detection</id><content type="html" xml:base="https://acuna.io/funding/scalable-figure-reuse-detection/">&lt;aside class=&quot;sidebar__right&quot;&gt;
&lt;nav class=&quot;toc&quot;&gt;
    &lt;header&gt;&lt;h4 class=&quot;nav__title&quot;&gt;&lt;i class=&quot;fas fa-file-alt&quot;&gt;&lt;/i&gt; On this page&lt;/h4&gt;&lt;/header&gt;
&lt;ul class=&quot;toc__menu&quot; id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#investigators&quot; id=&quot;markdown-toc-investigators&quot;&gt;Investigators&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#abstract&quot; id=&quot;markdown-toc-abstract&quot;&gt;Abstract&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

  &lt;/nav&gt;
&lt;/aside&gt;

&lt;p&gt;&lt;a href=&quot;#&quot; class=&quot;btn btn--primary&quot;&gt;See in ORI&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;investigators&quot;&gt;Investigators&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/about/&quot;&gt;Daniel E. Acuña&lt;/a&gt; (PI)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;abstract&quot;&gt;Abstract&lt;/h3&gt;

&lt;p&gt;Fraudulent reuse of scientific figures is an increasingly common problem that 
damages the public perception of science. 
The Office of Research Integrity (ORI) reviews whistleblowers’ accusations carefully,
 taking a reactive approach to investigate this type of misconduct. 
 Recently, Acuna, Brookes, and Kording (2018) used machine learning to detect 
 figure reuse in PubMed Open Access articles that share the same junior or senior 
 scientist. They estimated that around 0.6% of the papers were very likely fraudulent.
 Current image manipulation investigations, however, are reactive, 
 originate from whistleblowers, do not scale across authors, and lack 
 statistically supported verdicts.&lt;/p&gt;

&lt;p&gt;In this project, we propose to dramatically scale automated detection 
of figure reuse across articles and collaborate with ORIs and active 
researchers in the area. We also propose to develop statistical methods 
to support conclusions regarding figure reuses. Once this project is completed, 
we hope that the tools and techniques we develop will 
become standard practice. We expect our research to significantly 
reduce the acceptance of publications with image manipulation and 
therefore significantly reduce the incidence of one of the most 
damaging instances of scientific misconduct.&lt;/p&gt;</content><author><name>Daniel Acuña</name><email>daniel.acuna@colorado.edu</email></author><category term="Funding" /><summary type="html">Scalable methods for figure reuse detection</summary></entry><entry><title type="html">Acuna lab is looking for students to optimize science using machine learning (new Fall 2019)</title><link href="https://acuna.io/blog/acuna-lab-is-looking-for-students/" rel="alternate" type="text/html" title="Acuna lab is looking for students to optimize science using machine learning (new Fall 2019)" /><published>2018-06-25T00:00:00-06:00</published><updated>2018-06-25T00:00:00-06:00</updated><id>https://acuna.io/blog/acuna-lab-is-looking-for-students</id><content type="html" xml:base="https://acuna.io/blog/acuna-lab-is-looking-for-students/">&lt;h2 id=&quot;about-the-lab&quot;&gt;About the lab&lt;/h2&gt;

&lt;p&gt;Dr. Acuna is an Assistant Professor in the School of Information Studies 
at Syracuse University. He currently
works on mathematical and computational models of scientific discovery, predictability,
and integrity. Please take a moment to look
at his &lt;a href=&quot;/about/&quot;&gt;background&lt;/a&gt;, &lt;a href=&quot;/research/&quot;&gt;research&lt;/a&gt;, and &lt;a href=&quot;/funding/&quot;&gt;recent grants&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Professor Acuna teaches courses for the 
Applied Data Science 
and Information Management graduate degrees. He is currently the teacher and Professor of Record for 
the course IST 718: Big Data Analytics.&lt;/p&gt;

&lt;p&gt;Past Master’s students have done internships in Silicon Valley (e.g., Airbnb, Google), 
are working in major consulting companies (e.g., Ernst &amp;amp; Young, Goldman Sachs), and are 
broadly working as data scientists. Please see the &lt;a href=&quot;/people/&quot;&gt;People section&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;about-the-position&quot;&gt;About the position&lt;/h2&gt;

&lt;p&gt;Assistant Professor Daniel Acuna from the School of Information Studies (&lt;a href=&quot;https://acuna.io&quot;&gt;https://acuna.io&lt;/a&gt;), leader of the newly formed Science of Science and Computational Discovery (SOS+CD) Lab, is looking for Master’s students to work on quantitative analysis of big data. Broadly speaking, the SOS+CD Lab works on understanding how science works and semi-automatically generating scientific discoveries from vast, unstructured datasets of full-text publications, citations, and images. The SOS+CD Lab uses a variety of computational techniques including deep learning, natural language processing, graph analytics, image processing, and causal inference. The ideal candidate should have an undergraduate major in Computer Science, Engineering, Applied Statistics, Mathematics, or a similar quantitative field.&lt;/p&gt;

&lt;h3 id=&quot;requirements&quot;&gt;Requirements&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Develop reproducible software and tools to optimally match reviewers and manuscripts based on
 mathematical objective functions&lt;/li&gt;
  &lt;li&gt;Write method and result sections for scientific manuscripts&lt;/li&gt;
  &lt;li&gt;Have advanced computer programming skills in languages such as Python and R. SQL is also
desirable&lt;/li&gt;
  &lt;li&gt;Understand linear algebra, calculus, probability and statistics&lt;/li&gt;
  &lt;li&gt;Understand machine learning software tools and pipelines in &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;scikit-learn&lt;/code&gt;, &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;R&lt;/code&gt;, or &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;Spark ML&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;Understand basic concepts of software engineering&lt;/li&gt;
  &lt;li&gt;Have good communication skills&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;qualifications&quot;&gt;Qualifications&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Undergraduate (for MS students) or graduate degree in Computer Science, Engineering, 
Applied Statistics, Applied Mathematics, or similar quantitative fields&lt;/li&gt;
  &lt;li&gt;Minimum of 2 years of experience with coding in a major programming language such as 
Python, R, C, C++, or Java. Experience with handling big data with Apache Spark is a plus.&lt;/li&gt;
  &lt;li&gt;Demonstrable knowledge of linear algebra, calculus, probability, and statistics&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;apply&quot;&gt;Apply&lt;/h2&gt;

&lt;p&gt;Send an email to &lt;code class=&quot;language-plaintext highlighter-rouge&quot;&gt;deacuna AT syr DOT edu&lt;/code&gt; and include:&lt;/p&gt;
&lt;ol&gt;
  &lt;li&gt;A short introduction of yourself and why you want to work with me&lt;/li&gt;
  &lt;li&gt;A short CV or a 1-page resume&lt;/li&gt;
  &lt;li&gt;Your Github repository, preferably with code from a personal project rather than a “class project”.&lt;/li&gt;
  &lt;li&gt;Your transcripts&lt;/li&gt;
  &lt;li&gt;Your GRE, GMAT, or equivalent scores&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href=&quot;https://fa.ischool.syr.edu/apply/cc429ca9-41aa-4bde-b040-40858de5f256/&quot; class=&quot;btn btn--success btn--large&quot;&gt;Apply&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you have any questions, do not hesitate to contact me. 
If you are thinking of applying to the Ph.D. program, we have a very competitive &lt;strong&gt;fully-funded program&lt;/strong&gt;, and you
should contact me first. Otherwise, 
&lt;a href=&quot;https://ischool.syr.edu/admissions/checklists/phd-checklist/&quot;&gt;apply to the Ph.D. program&lt;/a&gt; 
and mention my name in your materials.&lt;/p&gt;

&lt;p&gt;Part of the funding for these positions has been generously provided by the National Science Foundation awards 
&lt;a href=&quot;/funding/grant-on-improving-scientific-innovation/&quot;&gt;#1646763&lt;/a&gt; and &lt;a href=&quot;/funding/optimizing-scientific-peer-review/&quot;&gt;#1800956&lt;/a&gt;&lt;/p&gt;</content><author><name>Daniel Acuña</name><email>daniel.acuna@colorado.edu</email></author><category term="Blog" /><summary type="html">Looking for Ph.D. and Master's students to work on exciting applications of artificial intelligence and pattern recognition to automate science.</summary></entry><entry><title type="html">NSF: Optimizing scientific peer review</title><link href="https://acuna.io/funding/optimizing-scientific-peer-review/" rel="alternate" type="text/html" title="NSF: Optimizing scientific peer review" /><published>2018-06-22T00:00:00-06:00</published><updated>2018-06-22T00:00:00-06:00</updated><id>https://acuna.io/funding/optimizing-scientific-peer-review</id><content type="html" xml:base="https://acuna.io/funding/optimizing-scientific-peer-review/">&lt;aside class=&quot;sidebar__right&quot;&gt;
&lt;nav class=&quot;toc&quot;&gt;
    &lt;header&gt;&lt;h4 class=&quot;nav__title&quot;&gt;&lt;i class=&quot;fas fa-file-alt&quot;&gt;&lt;/i&gt; On this page&lt;/h4&gt;&lt;/header&gt;
&lt;ul class=&quot;toc__menu&quot; id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#investigators&quot; id=&quot;markdown-toc-investigators&quot;&gt;Investigators&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#abstract&quot; id=&quot;markdown-toc-abstract&quot;&gt;Abstract&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

  &lt;/nav&gt;
&lt;/aside&gt;

&lt;p&gt;&lt;a href=&quot;https://www.nsf.gov/awardsearch/showAward?AWD_ID=1800956&quot; class=&quot;btn btn--primary&quot;&gt;See in NSF&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;investigators&quot;&gt;Investigators&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;/about/&quot;&gt;Daniel E. Acuña&lt;/a&gt; (PI)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://www.knowledgelab.org/people/detail/james_a_evans/&quot;&gt;James Evans&lt;/a&gt; (Co-PI)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://kordinglab.com/people/konrad_kording/index.html&quot;&gt;Konrad Körding&lt;/a&gt; (Co-PI)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;abstract&quot;&gt;Abstract&lt;/h3&gt;

&lt;p&gt;Scientific peer review is a central process when deciding who gets published, promoted, or awarded a prize or grant. Consequently, it may have tremendous impact on the career of scientists and the direction of science. Several researchers, however, have shown that scientific peer review can be slow and low-quality. Moreover, some studies have quantified peer review biases - e.g., prejudices against certain ideas - and inconsistencies - e.g., the same work receiving widely different opinions from different groups of peers. These problems delay or sometimes truncate the dissemination of important research, affecting technological development and ultimately the economy. This project analyzes factors that affect the outcomes of peer review, uses these to improve reviewer selection, develops software that optimizes reviewer assignments, and evaluates the resulting models in the real-world context of a scientific journal, major scientific conferences, and massive open, online courses (MOOCs). By the end of this project, the scientific community will have a better understanding of the factors that affect peer review and actionable insights to make peer review better.&lt;/p&gt;

&lt;p&gt;The first component of this project quantifies problems in bias, variance, timing, and quality of reviews. This includes direct effects (e.g., do they collaborate or cite one another) and indirect effects (e.g., do they contribute to and hopefully self-identify with the same community). The project also identifies bias as a function of personal characteristics of author and reviewer. These aspects include age, gender, and minority status, and their visibility and centrality within the field. The same general approach is used to predict the timing of reviews, including the choice to accept the review task. Lastly, the research uses this feature set to predict the quality of reviews. The result, for a given manuscript, includes predictions for each possible reviewer’s biases and decision variance, likelihood and timing to participate in the review process, and ultimate review quality. The second component of this project researches and develops techniques to estimate the characteristics of potential reviewers and uses those inferred characteristics to propose, for any given manuscript, a review panel. The techniques optimize the expected value for a cost function that balances the three objectives of reviewer choice variance (bias and covariance), review timing, and review quality. Presumably, this involves suggesting panels composed of reviewers with complementary expertise and potentially career stage, who understand the topic and are interested in the manuscript’s contents. The project allows the option of making these recommendations conditional on the background, characteristics, and position of the editor under consideration. Lastly, the project tests the techniques that automatically assign reviewers and analyzes the output of the process in real-world applications. In particular, the project collaborates with a large journal, scientific conferences, and massive open, online course (MOOC) organizations. 
Through random assignments (current methods versus the project’s algorithm), the project evaluates the degree to which the assignment approach produces less reviewer choice variance, faster reviews, and reviews of higher quality. The project creates software and results that can be used by other venues.&lt;/p&gt;</content><author><name>Daniel Acuña</name><email>daniel.acuna@colorado.edu</email></author><category term="Funding" /><summary type="html">This project analyzes factors that affect the outcomes of peer review, uses these to improve reviewer selection, develops software that optimizes reviewer assignments, and evaluates the resulting models in the real-world context of a scientific journal, major scientific conferences, and massive open, online courses (MOOCs)</summary></entry><entry><title type="html">Bioscience-scale automated detection of figure element reuse</title><link href="https://acuna.io/research/bioscience-scale-automated-detection-of-figure-reuse/" rel="alternate" type="text/html" title="Bioscience-scale automated detection of figure element reuse" /><published>2018-02-23T00:00:00-07:00</published><updated>2018-02-23T00:00:00-07:00</updated><id>https://acuna.io/research/bioscience-scale-automated-detection-of-figure-reuse</id><content type="html" xml:base="https://acuna.io/research/bioscience-scale-automated-detection-of-figure-reuse/">&lt;p&gt;&lt;a href=&quot;http://acuna.io&quot;&gt;Daniel E Acuna&lt;/a&gt;, School of Information Studies, Syracuse University&lt;br /&gt;
&lt;a href=&quot;https://www.urmc.rochester.edu/people/23781238-paul-spencer-brookes&quot;&gt;Paul S Brookes&lt;/a&gt;, University of Rochester Medical Center&lt;br /&gt;
&lt;a href=&quot;http://kordinglab.com/&quot;&gt;Konrad P Kording&lt;/a&gt;, University of Pennsylvania&lt;/p&gt;

&lt;hr /&gt;
&lt;p&gt;&lt;a href=&quot;https://doi.org/10.1101/269415&quot; class=&quot;btn btn--primary&quot;&gt;bioRxiv&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Abstract&lt;/strong&gt;&lt;br /&gt;
Scientists reuse figure elements sometimes appropriately, e.g. when comparing methods, and sometimes inappropriately, e.g. when presenting an old experiment as a new control. To understand such reuse, automatically detecting it would be important. Here we present an analysis of figure element reuse on a large dataset comprising 760 thousand open access articles and 2 million figures. Our algorithm detects figure region reuse, while being robust to rotation, cropping, resizing, and contrast changes, and estimates which of the reuses have biological meaning. Then a three-person panel analyzes how problematic these biological reuses are using contextual information such as captions and full texts. Based on the panel reviews, we estimate that 9% of the biological reuses would be unanimously perceived as at least suspicious. We further estimate that 0.6% of all articles would be unanimously perceived as fraudulent, with inappropriate reuses occurring 43% across articles, 28% within article, and 29% within a figure. Our tool rapidly detects image reuse at scale, promising to be useful to a broad range of people that campaign for scientific integrity. 
We suggest that a great deal of scientific fraud will be, sooner or later, detectable by automatic methods.&lt;/p&gt;</content><author><name>Daniel Acuña</name><email>daniel.acuna@colorado.edu</email></author><category term="Research" /><summary type="html">Daniel E Acuna, School of Information Studies, Syracuse University Paul S Brookes, University of Rochester Medical Center Konrad P Kording, University of Pennsylvania</summary></entry><entry><title type="html">IST 718: Big Data Analytics</title><link href="https://acuna.io/teaching/IST718/" rel="alternate" type="text/html" title="IST 718: Big Data Analytics" /><published>2018-02-18T00:00:00-07:00</published><updated>2018-06-28T00:00:00-06:00</updated><id>https://acuna.io/teaching/IST718</id><content type="html" xml:base="https://acuna.io/teaching/IST718/">&lt;p class=&quot;notice--danger&quot;&gt;&lt;strong&gt;This is an advanced course&lt;/strong&gt;: There seem to be no official pre-requisites
 in Syracuse University’s catalog system for taking this class. 
Most students have already taken &lt;em&gt;IST 687 - Introduction to Data Science&lt;/em&gt;, 
which is a nice introduction to the field. &lt;strong&gt;However, students will be
expected to know programming in Python or R and have
some background in linear algebra, calculus, probability, and statistics as well&lt;/strong&gt;. This means
that even if you register for the class, you might not have the necessary
background to fully take advantage of what this class has to offer.&lt;br /&gt;
If you are in doubt, take the following test, which you should be able to solve relatively
easily&lt;br /&gt;
&lt;a href=&quot;/assets/pdf/preliminary_test_ist718.pdf&quot; class=&quot;btn btn--inverse&quot;&gt;&lt;i class=&quot;far fa-file-pdf&quot;&gt;&lt;/i&gt; Preliminary test&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the past, I have suggested students go through the following courses
to grasp the basic math required to be a good data scientist:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Linear algebra&lt;/strong&gt;: &lt;a href=&quot;https://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm&quot;&gt;MIT OCW’s Linear Algebra&lt;/a&gt; course, which is free. The first couple of lectures cover most of what you need.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Calculus&lt;/strong&gt;: Another free course, &lt;a href=&quot;https://ocw.mit.edu/courses/mathematics/18-01sc-single-variable-calculus-fall-2010/unit-2-applications-of-differentiation/&quot;&gt;MIT OCW’s Calculus&lt;/a&gt;. I would recommend Parts A and B for IST 718.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Probability and statistics&lt;/strong&gt;: I would recommend the first chapters of DeGroot and Schervish’s book “Probability and Statistics”.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Programming&lt;/strong&gt;: There are plenty of resources online about programming. For programming in Python, I would recommend &lt;a href=&quot;https://jakevdp.github.io/PythonDataScienceHandbook/&quot;&gt;Jake VanderPlas’s “Python Data Science Handbook”&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;goal&quot;&gt;Goal&lt;/h3&gt;

&lt;p&gt;This course is a broad introduction to modern techniques in data science, including elastic net regularized regression, random forest, gradient boosting, and deep learning. It emphasizes a statistical learning point of view and a careful examination of generalization error, model interpretability, feature engineering, and the bias-variance tradeoff.&lt;/p&gt;

&lt;h3 id=&quot;tools&quot;&gt;Tools&lt;/h3&gt;

&lt;p&gt;The tool of choice is Apache Spark on Hadoop’s HDFS. The environment we use is Databricks Community Edition, which runs a highly customized version of the Jupyter Notebook.&lt;/p&gt;

&lt;h3 id=&quot;prerequisites&quot;&gt;Prerequisites&lt;/h3&gt;

&lt;p&gt;The prerequisites for this course are a basic knowledge of discrete mathematics, calculus, probability, and Python.&lt;/p&gt;

&lt;p&gt;We use the following books:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/wesm/pydata-book&quot;&gt;Python for Data Analysis (PFDA)&lt;/a&gt;, 2nd Edition&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Sixth%20Printing.pdf&quot;&gt;An Introduction to Statistical Learning with Applications in R (ISLR)&lt;/a&gt; by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://go.databricks.com/definitive-guide-apache-spark&quot;&gt;Spark: The Definitive Guide (STDG), Upcoming (expected 2018)&lt;/a&gt; by B. Chambers and M. Zaharia&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://www.deeplearningbook.org/&quot;&gt;Deep Learning (DL)&lt;/a&gt; by Ian Goodfellow, Yoshua Bengio, and Aaron Courville&lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;syllabus&quot;&gt;Syllabus&lt;/h3&gt;

&lt;iframe src=&quot;https://drive.google.com/file/d/1q-ibQUpdTxZQOUAU4bhn-F1BuXS3BXvS/preview&quot; width=&quot;640&quot; height=&quot;480&quot;&gt;&lt;/iframe&gt;</content><author><name>Daniel Acuña</name><email>daniel.acuna@colorado.edu</email></author><category term="Teaching" /><summary type="html">This is an advanced course: There seem to be no official pre-requisites in the Syracuse University’s catalog system for taking this class. Most students have already taken IST 687 - Introduction to Data Science, which is a nice introduction to the field. However, students will be expected to know programming in Python or R and have some background in linear algebra, calculus, probability, and statistics as well. This means that even if you register for the class, you might not have the necessary background to fully take advantage of what this class has to offer. If you are in doubt, take the following test, which you should be able to solve relatively easily Preliminary test</summary></entry><entry><title type="html">NSF: EAGER: Improving scientific innovation by linking funding and scholarly literature</title><link href="https://acuna.io/funding/grant-on-improving-scientific-innovation/" rel="alternate" type="text/html" title="NSF: EAGER: Improving scientific innovation by linking funding and scholarly literature" /><published>2016-09-01T00:00:00-06:00</published><updated>2018-06-28T00:00:00-06:00</updated><id>https://acuna.io/funding/grant-on-improving-scientific-innovation</id><content type="html" xml:base="https://acuna.io/funding/grant-on-improving-scientific-innovation/">&lt;aside class=&quot;sidebar__right&quot;&gt;
&lt;nav class=&quot;toc&quot;&gt;
    &lt;header&gt;&lt;h4 class=&quot;nav__title&quot;&gt;&lt;i class=&quot;fas fa-file-alt&quot;&gt;&lt;/i&gt; On this page&lt;/h4&gt;&lt;/header&gt;
&lt;ul class=&quot;toc__menu&quot; id=&quot;markdown-toc&quot;&gt;
  &lt;li&gt;&lt;a href=&quot;#investigator&quot; id=&quot;markdown-toc-investigator&quot;&gt;Investigator:&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#abstract&quot; id=&quot;markdown-toc-abstract&quot;&gt;Abstract&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#articles-7&quot; id=&quot;markdown-toc-articles-7&quot;&gt;Articles (7)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#web-service-and-software&quot; id=&quot;markdown-toc-web-service-and-software&quot;&gt;Web service and software&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

  &lt;/nav&gt;
&lt;/aside&gt;

&lt;p&gt;&lt;a href=&quot;https://www.nsf.gov/awardsearch/showAward?AWD_ID=1646763&quot; class=&quot;btn btn--primary&quot;&gt;See in NSF&lt;/a&gt;&lt;/p&gt;

&lt;h3 id=&quot;investigator&quot;&gt;Investigator:&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;Daniel E. Acuña&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;abstract&quot;&gt;Abstract&lt;/h3&gt;

&lt;p&gt;This project identifies scientists and organizations and their topical interests, enabling the tracking of past productivity and impact. By linking scholarly literature and grants, this project creates a unified dataset that captures diverse scientific disciplines and federal grant award types. A web-based tool levels the playing field for scientists lacking knowledge about research and funding programs. Users are expected to spend less time searching the literature and more time evaluating significance and impact.&lt;/p&gt;

&lt;p&gt;This project consolidates disparate repositories of publications and grants, disambiguates and enriches information about scientists and organizations, and builds a web-based tool to help navigate this information. This project solves many of these issues by modeling the relationship between approximately 2.6 million grants from the Federal RePORTER and a consolidated, multi-source dataset of millions of articles from Microsoft Academic Graph (83 M), MEDLINE (25 M), PubMed Open Access Subset (1 M), ArXiv (0.6 M), and the National Bureau of Economic Research [NBER] (14K). The project creates a web-based tool that generates instantaneous reports about publications, grants, scientists, and organizations related to users’ interests. The unified dataset and web tool could revolutionize how Program Officers evaluate proposals and how researchers find fundable ideas, making science faster, more accurate, and less biased.&lt;/p&gt;

&lt;h3 id=&quot;articles-7&quot;&gt;Articles (7)&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Achakulvisut, T, &lt;strong&gt;Acuna, DE&lt;/strong&gt;, Bassett, DS, Kording, KP, &lt;em&gt;Unique subfields of neuroscience 
exhibit more diverse language&lt;/em&gt; &lt;a href=&quot;https://github.com/titipata/language-variability-neuro/blob/master/manuscript/unique_subfields_achakulvisut.pdf&quot;&gt;Link&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Liénard, JF, Achakulvisut, T, &lt;strong&gt;Acuna, DE&lt;/strong&gt;, David, SV, &lt;em&gt;Intellectual Synthesis in Mentorship Determines Success in Academic Careers&lt;/em&gt; &lt;a href=&quot;https://doi.org/10.1101/273888&quot;&gt;Link&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Harandi, M, &lt;strong&gt;Acuna, DE&lt;/strong&gt;, &lt;em&gt;Differences in productivity patterns for junior and senior NSF grantees&lt;/em&gt;&lt;/li&gt;
  &lt;li&gt;Teplitskiy, M, &lt;strong&gt;Acuna, DE&lt;/strong&gt;, Elamrani-Raoult, A, Körding, K, Evans, J &lt;em&gt;The Social Structure of Consensus in Scientific Review&lt;/em&gt; &lt;a href=&quot;https://arxiv.org/pdf/1802.01270.pdf&quot;&gt;Link&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Acuna, DE&lt;/strong&gt;, Brookes, PS, Kording, KP (2018) &lt;em&gt;Bioscience-scale automated detection of figure element reuse&lt;/em&gt;, bioRxiv, &lt;a href=&quot;https://doi.org/10.1101/269415&quot;&gt;Link&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Shema, A, &lt;strong&gt;Acuna, DE&lt;/strong&gt; (2017) &lt;em&gt;Show Me Your App Usage and I Will Tell Who Your Close Friends Are: Predicting User’s Context from Simple Cellphone Activity&lt;/em&gt;, CHI 2017, Pages 2929-2935, Denver, Colorado &lt;a href=&quot;https://dl.acm.org/citation.cfm?id=3053275&quot;&gt;Link&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Achakulvisut T, &lt;strong&gt;Acuna DE&lt;/strong&gt;, Ruangrong T, Kording K (2016) &lt;em&gt;Science Concierge: A Fast Content-Based Recommendation System for Scientific Publications&lt;/em&gt;. PLoS ONE 11(7): e0158423. doi:10.1371/journal.pone.0158423 &lt;a href=&quot;http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0158423&quot;&gt;Link&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;web-service-and-software&quot;&gt;Web service and software&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;http://eileen.io&quot;&gt;eileen.io&lt;/a&gt;
    &lt;ul&gt;
      &lt;li&gt;&lt;a href=&quot;https://github.com/sciosci/nsf_data_ingestion&quot;&gt;Data ingestion pipeline&lt;/a&gt;&lt;/li&gt;
      &lt;li&gt;Front end (&lt;em&gt;soon&lt;/em&gt;)&lt;/li&gt;
      &lt;li&gt;Back end (&lt;em&gt;soon&lt;/em&gt;)&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;</content><author><name>Daniel Acuña</name><email>daniel.acuna@colorado.edu</email></author><category term="Funding" /><summary type="html">This project identifies scientists and organizations and their topical interests, enabling the tracking of past productivity and impact.</summary></entry></feed>