SOFTWARE
  1. notnews: predict the type of news based on story text and URL
    Software

  2. Infer Race and Ethnicity From Names:

  3. Infer Gender From Names:

  4. Search for a long list of names (patterns) in a large text corpus systematically and quickly
    Software

  5. Categorize the Content of Domains:

  6. Lost Years: Expected Number of Years Lost
    Python Package

  7. Know Your IP
    Python Package | Application

  8. Highlight Citations to Retracted Articles
    Website | Code

  9. AutoSum: Summarize Publications Automatically and Discover Miscitations
    Software

  10. Adjust Naive Estimates of Learning for Guessing
    R package | Related Paper

  11. Get Weather Data:
    Please read this before downloading any of the following scripts.

    • Find nearest zip codes given a list of weather stations (COOP and GHCND) via
      GeoNames: Data & Scripts
    • Find nearest weather stations given a list of zip codes: Data & Scripts
    • Get data from the nearest weather station given a list of zip codes and date range:
      Script
    • Get data from the nearest weather station given a list of zip codes and date range
      using the NOAA web-service:  Script
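
    As an illustration of the nearest-station lookup described above, here is a
    minimal sketch that is not taken from the linked scripts. It assumes each zip
    code is represented by a centroid latitude/longitude and picks the station
    with the smallest great-circle (haversine) distance. The station IDs,
    coordinates, and function names are hypothetical.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def nearest_station(zip_latlon, stations):
    """Return (station_id, distance_km) for the station closest to one zip centroid."""
    lat, lon = zip_latlon
    return min(
        ((sid, haversine_km(lat, lon, slat, slon)) for sid, (slat, slon) in stations.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical coordinates, for illustration only
stations = {"STATION_A": (40.78, -73.97), "STATION_B": (34.05, -118.24)}
zip_centroids = {"10001": (40.75, -73.99), "90012": (34.06, -118.24)}

nearest = {z: nearest_station(c, stations) for z, c in zip_centroids.items()}
for zip_code, (sid, dist) in nearest.items():
    print(zip_code, sid, round(dist, 1))
```

    A brute-force scan like this is fine for a few thousand stations; for larger
    lists a spatial index would likely be a better fit.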

  12. Image to Text:
    Please read this before downloading any of the following scripts.

  13. Edit Distance Based Search and Replace
    Software | Related Note
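
    To make the idea concrete, the sketch below shows one common reading of
    edit-distance-based search and replace: compute the Levenshtein distance
    between each word and a target, and substitute when the distance is within a
    threshold. This is an assumption about the general technique, not the linked
    software's implementation; `edit_distance`, `fuzzy_replace`, and `max_dist`
    are hypothetical names.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fuzzy_replace(text, target, replacement, max_dist=1):
    """Replace whitespace-separated words within max_dist edits of target."""
    words = text.split()
    return " ".join(
        replacement if edit_distance(w, target) <= max_dist else w
        for w in words
    )


print(fuzzy_replace("the presdent spoke", "president", "President"))
```

    Splitting on whitespace keeps the example short; real text would also need
    handling for punctuation and multi-word patterns.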

  14. Text as Data:

    • Normalize text (remove stop words, punctuation, and numbers; stem; lemmatize):
      Script
    • Subset, Randomly Sample, Summarize: Script
    • Create TDM with various weighting schemes: Script
    • Sentiment Analysis: Script
    • Supervised Learning: Classification, Regression

  15. Clarifai: Understand (Moving) Images
    R package | Analysis of Politicians' Instagrams | Infer Gender Based on First Name

  16. tuber: Access YouTube from R
    R package
    REVIEW: 'Thank you very much for the package ... it has made my life easy ....'

  17. tubern: R Client for the YouTube Analytics and Reporting API
    R package

  18. virustotal: R Client for the VirusTotal Public API 2.0
    R package

  19. aws.alexa: Access Amazon Alexa from R
    R package

  20. Collecting Data from the Streets:

  21. Collecting, Parsing, and Processing Indian Electoral Rolls:

    • Collecting Indian Electoral Rolls
      Python scripts

    • Elector Count: Estimate the Total Number of Electors in a State
      Python script

    • Table Translator: Use the Google Translate API to Get Word-Level Translations and Append Translated Cell Values Back
      Python script

  22. countpy.com: counting more than downloads
    Website

  23. incline: Estimate Trend at a Particular Point in Time in a Noisy Time Series
    Python Package