All datasets with keywords |
Text mining` article` api` text` corpus` newspaper |
|
Information Extraction: The RISE Repository of Information Sources |
Text mining` information` text mining` extraction` reviews` jobs |
Text mining` links` text mining` books` rdf` ocr` documents |
|
Text mining` api` blog` comments` text mining` stream` trends` backtype` queryminer |
|
Free book usage data from the University of Huddersfield » "Self-plagiarism is style" |
Text mining` books` library` borrowing` recommender` isbn` recommendation` collaborative` filtering |
ICWSM 2009 - International AAAI Conference on Weblogs and Social Media |
Text mining` blog` crawl` corpus` network` web` link |
Change.gov: The Obama-Biden Transition Team | Join the Discussion: Healthcare |
Text mining` textmining` opinion` comment` topic` government` queryminer |
Opinion Extraction, Opinion Mining, Sentiment Analysis, Summarization of Customer Reviews |
Text mining` sentiment` mining` classification` machine learning` reviews` recommender` text mining` links |
Text mining` wikipedia` named entity` tagged` text mining |
|
Text mining` django` wikipedia` compressed` text mining` howto |
|
Text mining` reddit` api` json` |
|
Text mining` phishing` corpus` text` email` text mining` nlp` mail` security |
|
Text mining` wikipedia` hadoop` textmining` links |
|
Text mining` question` answering` trec` nlp` machinelearning |
|
The New York Times Annotated Corpus « YooName - named entity recognition |
Text mining` named entity` nytimes` corpus` people` organizations` locations |
Text mining` named entity` location` place names` geo` nlp` natural language processing |
|
Text mining` book` data` wiki` via:jhammerb |
|
Text mining` faq` question_answering` questions` web` crawl` corpus` xml` textmining |
|
Wikipedia:Lists of common misspellings/For machines - Wikipedia, the free encyclopedia |
Text mining` spelling` misspelling` wikipedia |
Business and Finance` finance` api` social` kiva` microlending` lending |
|
Business and Finance` visualization` retail` finance` gis` map` location` store` via:magnetbox |
|
Business and Finance` finance` commercial` consumer` mint` spending |
|
Best Buy Remix - Welcome to the Best Buy Remix Developer Network |
Business and Finance` retail` data` api` product` bestbuy |
Behavioral Targeting, Analytics and Advertising Service for Publishers, Ad Networks |
Business and Finance` analytics` audience` segmentation` toolbar` commercial` sem` search` advertising |
Business and Finance` ceo` compensation` pay` economics` business` labor |
|
Business and Finance` trading` finance` s` api` list |
|
Business and Finance` netflix` api` movie` mashup` netflixprize` ratings |
|
Open beats Closed: Best Buy’s new APIs - O'Reilly Radar |
Business and Finance` retail` bestbuy` api |
Business and Finance` custom` research` retail` finance` market` service` analyst` |
|
Business and Finance` retail` dillards` uark |
|
developerWorks Interviews: Massive data mining and the resurgent mainframe |
Business and Finance` price` retail` transaction` sams_club` dillards |
Business and Finance` opentick` nasdaq` finance` stock |
|
Business and Finance` finance` links` sec |
|
Business and Finance` edgar` finance` sec` filing` ftp` instructions |
|
Business and Finance` investing` finance` datamining` announcement` sec` filing` links |
|
Government` un` voting` statistics` government |
|
Research Datasets :: CID Data :: Center for International Development at Harvard University (CID) |
Government` economics` international` development |
Government` government` banking` csv` tarp` bailout |
|
Government` dc` government` feeds` transparency` opendata |
|
Announcing the New York Times Campaign Finance API - Open - Code - New York Times Blog |
Government` nyt` api` campaign` donations` fec` |
Voter registration data; or, HERE IS YOUR HOPE, YOU FOOLS! « The Edge of the American West |
Government` voter` registration` politics` 2008 |
import/parse/fec.py at master from aaronsw's watchdog — GitHub |
Government` fec` python` parser` government` campaign |
Government` government` transparency` parsing` election` python |
|
Dataset of the day: Where are the Obamacans? | Off the Map - Official Blog of FortiusOne |
Government` obama` government` mashup` gis` geo` map` campaign` donations |
Government` cmu` politics` campaign` donations` fec` via:jhammerb` government |
|
Government` timeseries` crime` statistics` publicdata |
|
Government` voter` voting` politics` government` name` address` registration |
|
Voter List Data Files - Election Department, Clark County, Nevada |
Government` voting` voter` registration` name` address` data` election` politics |
Government` UN` publicdata` government` statistics |
|
RealClearPolitics - Election 2008 - Democratic Presidential Nomination |
Government` polls` politics |
Government` crime` fbi |
|
Daily Kos: Obama helps us track $1,000,000,000,000 of federal spending |
Government` corruption` government` politics` finance` |
Government` government` money` politics` |
|
Government` campaign` politics` elections |
|
Government` usda` economics` population` cpi` gdp` income |
|
Government` government` directory` links` wiki` states |
|
Government` economics` links |
|
Government` economics` lumber` building` materials` homedepot |
|
Government` government` bridges` safety |
|
Twitter API Wiki / REST API Documentation: Social Graph Methods |
Network Analysis` graph` network` api` social` twitter |
Network Analysis` graph` network` link` wikipedia` pagerank |
|
Network Analysis` directory` businesses` twitter` companies |
|
Massive Scrape of Twitter’s Friend Graph « blog.infochimps.org - Organizing Huge Information Sources |
Network Analysis` textmining` twitter` network` socialnetwork` pagerank` graph` queryminer |
Network Analysis` twitter` socialnetwork` graph |
|
Network Analysis` wikipedia` named_entity` rdf` ontology |
|
ICWSM 2009 - International AAAI Conference on Weblogs and Social Media |
Network Analysis` blog` crawl` corpus` network` web` link |
Network Analysis` rdf` movies` movie` api |
|
Network Analysis` youtube` research` crawl` socialnetwork` network` graph` web |
|
API Documentation - Twitter Development Talk | Google Groups |
Network Analysis` twitter` text` api |
Network Analysis` wireless` RF` radio` signal` dartmouth` network |
|
Network Analysis` api` yahoo` music` artists |
|
Web Analytics` web` analytics` api` traffic` advertising` demographics` lookery |
|
Spatial Analysis` gis` geo` map` mapping` images` satellite |
|
Spatial Analysis` neighborhoods` geo` gis` maps |
|
Image Analysis and Video Analysis` fmri` neuroscience` python` neuralnetwork |
|
Image Analysis and Video Analysis` face` detection` image |
|
Image Analysis and Video Analysis` facerecognition` opencv` face` links |
|
NORB Object Recognition Dataset, Fu Jie Huang, Yann LeCun, New York University |
Image Analysis and Video Analysis` image` 3d |
Image Analysis and Video Analysis` images` photo` pictures` search |
|
Image Analysis and Video Analysis` activity` recognition` intent |
|
Image Analysis and Video Analysis` facerecognition` face` image` recognition |
|
Image Analysis and Video Analysis` images` audio` publicdata` maps` video` free |
|
Image Analysis and Video Analysis` image` vision` recognition` |
|
Image Analysis and Video Analysis` tracking` video` detection` image` recognition` vehicle` pedestrian` |
|
Image Analysis and Video Analysis` image` recognition` detection` pedestrian` thermal` tracking` facerecognition` illumination |
|
Carnegie Mellon University - CMU Graphics Lab - motion capture library |
Image Analysis and Video Analysis` gait` pedestrian` walk` motion |
Audio Analysis` sound` publicdomain` audio |
|
Bioinformatics` fmri` neuroscience` python` neuralnetwork |
|
Medical Informatics` timeseries` machinelearning` ecg` health` medical` sleep` apnea |
|
UC Berkeley. Sheldon Margen Public Health Library. Statistical/Data Resources |
Healthcare Analytics` health` links` resources` publichealth` berkeley |
Healthcare Analytics` google` health` trends` search` prediction` epidemiology` biodefence` queries |
|
Eigenvector Research, Inc. : Data Sets Available to Download |
Chemoinformatics` NIR` spectra` chemistry` semiconductor` pharmaceutical` matlab` |
Healthcare Analytics` duplicate |
|
Healthcare Analytics` health` information` public` publicdata |
|
Healthcare Analytics` mri` cardiac |
|
Demography` aging` statistics` studies |
|
Demography` poverty` statistics |
|
Demography` internet` demographics` online` web |
|
Demography` gis` census` rdf` semantic` sparql |
|
Sports Analysis` baseball` database` publicdata` statistics` sports |
|
It’s a Pitch-by-Pitch Scouting Report, Minus the Scout - New York Times |
Sports Analysis` baseball` gameday |
Network Analysis` urban` transportation` feeds` public` sanfrancisco` bart` api` |
|
Tim Davis: UF Sparse Matrix Collection : sparse matrices from a wide range of applications |
Matrices` sparse` matrix |
Pre-processing` resources` links` mapping |
|
Amazon Web Services` amazon` ebs` ec2` s3` publicdata` hadoop |
|
Hosted Datasets` amazon` ebs` publicdata |
|
Web Analytics` workshop` search` web` microsoft` log` |
|
downloading - flossmole - Google Code - How to get FLOSSmole data for your own use |
Google` opensource` project` activity` mysql` dump |
Supervised` sentiment` review` product` amazon |
|
bizarre` scifi` phrase` name` word` generators` random` perl |
|
Phrases` webservice` api` thesaurus` textmining` nlp` rest` |
|
Search Query Performance report - Google AdWords Help Center |
Performance` adwords` ppc` search` metrics` webanalytics` sem` query` queryminer |
Web Analytics` queryminer` keyword` tool` research` commercial` search` adwords |
|
Network Analysis` links` catalogs` social |
|
Audio Analysis` lidar` visualization` radiohead` google` video |
|
Image Analysis and Video Analysis` images` words` english` search` visualization` imagemap |
|
Temporal Analysis` timeseries` anomaly` detection` astronomical` physics |
|
Image Analysis and Video Analysis` visualization` community` design` processing |
|
BGN: Domestic Names - State and Topical Gazetteer Download Files |
Demography` gis` usgs |
Random` benchmark` clustering` regression` machinelearning` list` statistics` mathematics |
|
Image Analysis and Video Analysis` nonlinear` dimensionality` reduction` faces` digits` images` manifold |
|
Yahoo! Search Blog: BOSS -- The Next Step in our Open Search Ecosystem |
oss` api` open` search` yahoo` BOSS` queryminer |
Download the Database - IP Address Lookup - Community Geotarget IP Project |
Network Analysis` geocoding` geoip` internet` ip` ipaddress` mysql |
Government` airline` statistics` finance` revenue` location` travel |
|
Show Us a Better Way: What public data is already available? |
Government` statistics` census` uk` school` news` publicdata |
Government` country` cities` geo |
|
Government` government` traffic` statistics` trends` transportation |
|
Government` via:inkdroid` libraries` mashup` rdf` semantic` search` semanticweb` books |
|
reddit.com: Ask Reddit: Where to download a DB dump of Reddit? |
Text mining` reddit` socialnetwork` news` web |
Text mining` collaborative` filtering` dating` rating` profiles` czech |
|
Business and Finance` predictionmarket` tool` finance` buzz` advertising` marketing` startup` mmds |
|
VGChartz.com | Video Games, Charts, News, Forums, Reviews, Wii, PS3, Xbox360, DS, PSP |
Business and Finance` sales` ranking` videogames` retail |
Business and Finance` retail` finance` sales` store` |
|
Image Analysis and Video Analysis` image` python` code` flickr` matlab` recognition |
|
Image Analysis and Video Analysis` image` recognition |
|
Network Analysis` tag` tagging` s |
|
Network Analysis` netflixprize` imdb` sparql |
|
Image Analysis and Video Analysis` machinelearning` motion` capture` sensor |
|
Text mining` api` buzz` opinion` trends` text` twitter` summize` search |
|
Image Analysis and Video Analysis` visualization` contest` scalability` motion` tracking` pedestrian` sensor |
|
Business and Finance` movie` revenue` sales` box_office` imdb` commercial` movie_study |
|
Business and Finance` movie` revenue` box_office` |
|
Live Search : xRank™ Celebrity — check out who’s hot and who’s not! |
Network Analysis` search` query` volume` trends` celebrity` prediction` buzz` named_entity |
Business and Finance` movie` revenue` timeseries` imdb` commercial` subscription |
|
Business and Finance` economics` links |
|
google` trends` search` web` analytics` api` code` python` hack |
|
google` trends` search` query` api` csv` keyword` timeseries |
|
Open Research - the Data: Lastfm-ArtistTags2007 - Duke Listens! |
last.fm` music` tagging` artists` tags` collaborative` filtering |
medical` obesity` |
|
tiger` gis` lectures |
|
geo` google` gps` location` geolocation` cell` wifi` api` gis |
|
celebrity` misspelling` spelling` names |
|
ImportGenius.com : U.S. Customs Database and Competitive Intelligence Tools |
commercial` shipping` imports` exports` finance` datamining |
betting` prediction` betfair` price` csv` predictionmarket |
|
news` text` articles` api` content` media` xml` images` publicdata |
|
scipy` python` machinelearning` statistics` resource |
|
wikipedia` pageviews` trends` textmining` seo` topic |
|
via:chl` wikipedia` web` analytics` seo` topic` textmining` traffic |
|
yahoo` geo` geocoding` location` landmarks` gis |
|
images` links` lists` archive` |
|
Yahoo offers geographic data to Web sites | Tech news blog - CNET News.com |
gis` webservice` yahoo` api` location` landmark |
query` search` log` excite` altavista` alltheweb` transaction |
|
TechTC - Technion Repository of Text Categorization Datasets |
datamining` textmining` categorization` classification` odp` directory` text |
textmining` classification` category` odp` directory |
|
FEC Election Contributions: Download Detailed Files by Election Cycle |
individual` donations` government` election` publicdata` fec |
search` statistics` keywords` analytics` api` python` web` seo` google |
|
mysql` states` countries` isocode |
|
hotels` geonames` |
|
locations` cities` countries` gis |
|
cities` gis |
|
corpus` text` similarity` terms |
|
web` crawler` bot` |
|
Data sets and corpus / corpora for biological literature and text mining |
bioinformatics` text` corpora` domainspecific` genomics` corpus` |
defect` recall` automobile` fightclub` nhtsa` safety |
|
p2psim - kingdata : DNS server latency network distance matrices |
distance` matrix` network` p2p` dns` latency` nmf` queryminer |
pagerank` web` matrix` matlab |
|
opentick` trading` beta` feeds` finance |
|
wikipedia` xml` ec2 |
|
walmart` visualization` video` freebase` store` retail` locations` opening |
|
gis` mobile` geolocation |
|
cornell` web` archive` hadoop` crawl |
|
im2gps: estimating geographic information from a single image |
imagerecognition` via:csantos` gis` cmu` gps` imageprocessing` paper` hack` freaking_awesome |
image` video` audio` currency` sports` imagerecognition |
|
economics` list |
|
free` movie` database` netflixprize |
|
api` cogmap` person` name` organization` record_linkage |
|
retail` locations` stores |
|
record_linkage` identity` name` organization` orgchart` marketing |
|
German English Parallel Corpus "de-news", Daily News 1996-2000 |
german` translation` corpus` english` text` via:maxme |
neuroscience` patch` clamp` recordings` neuron` timeseries` patchclamp` data` neural |
|
aggregator` links |
|
retail` clickstream` traffic` web` links` sales |
|
Dolores Labs Blog » Blog Archive » Our color names data set is online |
colormap` color` mechanicalturk |
teradata` retail` transactional` database |
|
large` competition` challenge` svm` machinelearning` scalability |
|
ECIS 2007 - The 15th European Conference on Information Systems |
retail` dillards` sams_club |
alexa` aws` web` search` api` |
|
creativecommons` court` legal` law` via:inkdroid |
|
blog` web` text |
|
Lyricsfly Lyrics API, database access to search for music artist and song title |
song` lyrics` database` api` |
99 Wikipedia Sources Aiding the Semantic Web » AI3:::Adaptive Information |
links` directory` record_linkage` extraction` wikipedia` named_entity` recognition` textmining` semanticweb |
audioscrobbler` recommendation` collaborative` filtering` music |
|
directory` rdf` semantic` data` soup` graph |
|
Free Economic Data | Economic, Financial, and Demographic Data |
finance` economics` portal` links |
machinelearning` trading` competition` backtest` matlab` code` finance` via:DeliciousRob |
|
computer` vision` image` ray` trace` fingerprint` stereo` detection` via:chl |
|
The Dataverse Network Project | The Dataverse Network Project |
statistics` repository` harvard |
harvard` repository` social` science` research` portal` links |
|
climate` temperature` netcdf |
|
MNIST handwritten digit database, Yann LeCun and Corinna Cortes |
handwriting` mnist` image` recognition |
facerecognition` face` recognition` umass` image |
|
generator` names |
|
generator` tools` list` via:jd |
|
compete` api` web` statistics` traffic` analytics` mashup |
|
peekaboom` vision` image` large` human` computation` machinelearning` recognition |
|
links` oceanography` satellite |
|
blog` ucla |
|
nlp` corpus` tagged` named_entity` recognition` list |
|
del.icio.us` |
|
finance` links |
|
wikipedia` xml` structured` corpus |
|
arxiv` api` open` paper` academic` |
|
England Football Results Betting Odds | Premiership Results & Betting Odds |
gambling` soccer` football` excel` statistics |
rna` bioinformatics` microarray` expression` gene` machinelearning |
|
bioinformatics` microarray` expression` gene` machinelearning` stanford |
|
bioinformatics` microarray` expression` gene` machinelearning |
|
bioinformatics` microarray` expression` gene` machinelearning |
|
corpus` text` legal` law` court` ruling` opensource` publicdata |
|
python` finance` edgar` pylons` matplotlib` sec` webservice` via:jolby |
|
links` statistics |
|
Text Mining, Visualization and Social Media |
crawler` blog` corpus |
facerecognition` machinelearning` face` image |
|
umd` links` statistics` government` sports` via:rickladd |
|
biology` medicine` articles` text` journal` authors |
|
music` similarity` machinelearning |
|
Internet Archive: Details: Amazon ASIN listing and similarity graph |
ASIN` amazon` recommendation` collaborative` filtering` via:keyvowel |
weather` europe` ascii` netcdf |
|
machinelearning` datamining` cmu` link` collection |
|
driving` transportation` publicdata |
|
books` sales` commercial |
|
finance` data` |
|
searchengine` search` tagging` aggregator` numeric` extraction` tables` collaboration` web2.0 |
|
textmining` open` nature` standards` search |
|
metafilter` comments` network` via:chl |
|
web` search` spam` crawler` yahoo |
|
socialnetwork` trustnetwork` trust |
|
TaskForces/CommunityProjects/LinkingOpenData/DataSets - ESW Wiki |
opendata` semantic` rdf` collaboration |
publicdata` links |
|
semanticweb` rdf` congress` politics` government |
|
networks` research` graph` tags` paper` record_linkage |
|
archive` internet` web` index` |
|
competition` machinelearning` forecasting` contest |
|
microsoft` text` paraphrase` corpus |
|
nlp` text` corpus` ngram` google` commercial` license |
|
census` names` identity` frequency` record_linkage |
|
Given Name Frequency Project: Analysis of Given Name Popularity |
name` record_linkage` text` identity` code |
enron` names` identity` text` record_linkage |
|
api` identity` people` webservice` record_linkage |
|
Name Discrimination Data Named Entity Resolution / Entity Disambiguation |
record_linkage` corpus` nlp` names |
Developers Area - eBay Market Data Documentation - eBay Market Data Documentation |
ebay` api` retail` price` code |
name` authorship` rdf` record_linkage |
|
bibliography` rdf` ontology` duplicate` name` record_linkage |
|
StrikeIron Super Data Pack Web Service 1.0 - StrikeIron Marketplace |
webservice` publicdata` datacleaning |
Duplicate Detection, Record Linkage, and Identity Uncertainty: Datasets |
duplicate` detection` record_linkage` datacleaning` text |
datacleaning` record_linkage` video` lectures` course` cornell` economics` finance` publicdata |
|
retail` overstock` sales` api` product` price` forecasting |
|
Amazon Web Services Developer Connection : Can Alexa WS provide detailed ... |
finance` alexa` amazon` tech |
ebay` retail` pricing` sales` api` product |
|
face` image |
|
epidemiology` gis` health |
|
Google Trends API coming soon | Tech news blog - CNET News.com |
google` trends` api` |
social` activity` location` cell` gis |
|
machinelearning` reinforcement` agent` competition` |
|
optimization` vehicle` routing |
|
oil` energy` statistics` economics` petroleum |
|
search` pagerank` text` tags` content |
|
machinelearning` CMU` course` projects` graphicalmodel` code` paper |
|
Financial Forecast Center's Historical Economic and Market Data |
exchangerate` dollar` economics` |
economics` indicators` time` series |
|
finance` numberpedia` mechanicalturk` textmining` statistics |
|
socialnetwork` graphs` comicbooks |
|
dictionary` words |
|
wikipedia` authorship` |
|
tools` generator |
|
recommender` collaborative` restaurant |
|
community resource guide: i've been here before - show me the links |
demographics` maps` gis` statistics` links |
economics` social` government` health` labor` links |
|
netflix` netflixprize` movie` index` wikipedia` |
|
paper` corpus` arXiv |
|
links` transparency` government` politics` congress` reference |
|
Technophilia: Where to find public records online - Lifehacker |
public` records` links |
corpus` email` spam` textmining |
|
enron` corpus` email` text` social` network |
|
finance` cpi` inflation` data |
|
health` gis` epidemiology` links |
|
cia` population` python` code` grep |
|
Miller Center of Public Affairs - Richard Nixon - Oval Office Recordings |
nixon` speech` tapes` audio` mp3` wav` flac |
phone` politics |
|
housing` refinance` mortgage` |
|
retail` finance` sales` sqft` |
|
retail` finance` sales` sqft |
|
retail` location` poi |
|
retail` poi` location` gis` gps |
|
retail` location` gis |
|
smallworld` networking` socialnetwork` graph |
|
collaborative` filtering` jokes |
|
video` |
|
links` finance` commercial |
|
finance` xml` edgar` sec` code` perl |
|
EDGAR` sec` mail` text |
|
finance` SEC` scrape` parse` commercial |
|
Retail and Food Services - Time Series Data/Seasonal Factors |
retail` sales` census |
categorization` textmining` detection` tools |
|
retail` sales` uk |
|
tools` generator` random |
|
consumer` data` database` api |
|
factset` finance` |
|
finance` ibes` analyst` forecast` wharton |
|
finance` |
|
yahoo` finance` stock` price` |
|
network` links |
|
statistics` labor` government` consumer |
|
housing` sales` finance |
|
ethanol` |
|
retail` finance` store` locations` gis |
|
retail` gis` store` locations |
|
Energy Information Administration - EIA - Official Energy Statistics from the U.S. Government |
finance` government` energy` historical` forecasts` fuel` oil |
links |
|
product` upc` database` |
|
crawler` benchmark` search` web` links |
|
TechTC - Technion Repository of Text Categorization Datasets |
corpus` text |
traffic` data` |
|
volume rendering |
|
vision` caltech` image recognition |
|
pedestrian` image` classification` detection |
|
finance` economics` feed` free` stock` trading` opentick` opensource |
|
textmining` corpus` concordance` wordlist` n-gram |
|
dictionary` hack` security` wordlist` password |
|
data` mysql` email` energy` text` social network |
|
blog` corpus` spam |
|
corpus` text` newsgroup |
|
crowd sourcing` image` processing` algorithm` collaborative` distributed` web2.0` code` opensource |
|
paleo climatology` climate` oceanography` coral` sponge` biology |
|
finance` economics` naics` industry` classifications |
|
democracy` web2.0` mashup` government` funding` article |
|
collaborative` wiki` government` congress` politics` elections` web2.0` directory |
|
census` data` population` statistics |
|
statistical learning` machine learning` code` R` libraries` cran` |
|
linkd` datamining` timeseries` text` extraction` socialnetwork |
|
python` visualization` library |
|
machine learning` network` graph` |
|
aol` search` |
|
python` text` |
|
corpus` nlp` machine learning` textmining |
|
video` machine learning` statistics` matrix` sampling` large` sparse` algorithm` experiment_design |
|
wikipedia` laptop` install` dump |
|
ranking` search |
|
CN710: Comparative Analysis of Learning Systems (Spring 2006) - Class Project |
machinelearning` algorithm` ogi` bu` greyhound` finance |
python` urban` software` simulation` opensource` GIS` census` |
|
wikipedia` rdf` |
|
wikipedia` rdf` tools |
|
face` algorithm` facerecognition` data` image |
|
face` seung` algorithm` recognition` image |
|
extraction` finance` semantic` semanticweb` text |
|
aol` search` video` talk` algorithm` information retrieval` datamining` machinelearning |
|
aol` search` query` analysis |
|
aol` search` query` analysis |
|
aol` search` oracle` database` code |
|
query` categorization` algorithm` google |
|
Statistical NLP / corpus-based computational linguistics resources |
corpus` machine learning` text |
text` machine learning` context` matlab |
|
machine learning` code` links |
|
pagerank` code` algorithm |
|
Official Google Research Blog: All Our N-gram are Belong to You |
linguistics` google` ngram` nlp` record_linkage |
clustering` algorithm` java` parallel |
|
blog` econometrics` finance` machine learning` math` statistics |
|
Structural Analysis of Discrete Data and Econometric Applications, |
books` econometrics` economics` finance` ebook |
Kris Brower » Archives » Google Onpage Search Results Analysis |
google` ranking` aol` search` analytics |
netflixprize` machine learning` course` |
|
matrixmarket` matrix` |
|
Estimation of mean values, covariance matrices and imputation of missing values |
imputation` matlab` missing` EM` machinelearning |
face` image |
|
subset` netflix prize` dimensionality` reduction |
|
extract` from` graphs` hack` google` trends |
|
python` processor` semantic` web` rdf |
|
link` analysis` structure` web` crawler` stanford |
|
machine` learning` matlab` python` hackers` image |
|
flight data` airplane data` weather data` airline route data` aircraft flight data` in-flight analysis` airline on-time data |
|
healthcare analytics` subjective outcomes` healthcare customer service` quality of care` patient care` patient surveys |
|
Largest collection of longitudinal hospital care data in the US |
healthcare analytics` healthcare big data` research datasets` national in-patient statistics` local healthcare statistics` in-patient statistics` hospital cost data` hospital use data |
physician visits data` doctor visit data` outpatient care data` private practice physician data` non-federal healthcare data` physician office data |
|
mnist` xml` format |
|
mnist` |
|
data` set` collaborative` filtering` datamining` books` movie |
|
movie` netflix prize` source` netflix |
|
Submissions Guidelines for the Collectorz.com Online Movie Database |
movie` source |
plot` synopsis` movie` netflix prize` prize |
|
netflixprize` prize` european` movie` revenue` |
|
mediawiki` wikipedia` import` mysql` sql |
|
"phone ***" " address *" "e-mail" intitle:"curriculum vitae" - Google Search |
resume` google |
random` generator` database` sql |
|
Finance`Loans`business`investing |
|
spam`email`text analysis |
|
Data Sets | Pew Research Center's Internet & American Life Project |
demography |
flickr`taxonomy`images |
|
yahoo |
|
bibliographies`text mining |
|
weblog`blog`social media`network analysis |
|
facebook`network analysis`social |
|
Amazon Web Services` amazon` ebs` ec2` s3` publicdata` hadoop |
|
human language`text mining |
|
government`finance`economy |
|
images |
|
twitter`text mining`social |
|
spider`web analytics |
|
Amazon Web Services` amazon` ebs` ec2` s3` publicdata` hadoop |
|
youtube`image analysis`video analysis |
|
face recognition`facial recognition`image analysis |
|
data repositories |
|
learning |
|
movies`video analysis`business |
|
Translation Task - EMNLP 2011 Sixth Workshop on Statistical Machine Translation |
translation`human language |
books`text mining |
|
wordnet`corpus |
|
canada`parliament`government`text mining |
|
CRCNS - Collaborative Research in Computational Neuroscience - Data sharing |
Bioinformatics` fmri` neuroscience` python` neuralnetwork |
usenet`text mining |
|
bioinformatics |
|
chemoinformatics |
|
algorithms |
|
genetics`bioinformatics |
|
social science |
|
business |
|
network analysis |
|
books`text mining |
|
audio analysis |
|
health informatics`bioinformatics |
|
auctions |
|
image analysis`pets`cats |
|
Click Dataset | Center for Complex Networks and Systems Research |
web analytics |
The Electric Rice Cooker — One year of deleted weibos archive |
text mining |
Registered meteorites that has impacted on Earth visualized - AnalyticBridge |
meteorites`atmosphere |
road`traffic`accidents`transportation |
|
road`traffic`accidents`transportation |