Text and Data Mining: what librarians need to know

Published on February 6, 2014

Author: EIFL

Source: slideshare.net

Description

Text and data mining of large datasets is often described as the new frontier for science and research.

This presentation is from a webinar hosted by the EIFL-Licensing Programme and the EIFL-IP (Copyright and Libraries) Programme on February 6, 2014, which can be found here: http://bit.ly/1iwr4io

In the webinar, Benjamin White (Head of Intellectual Property at the British Library) provided a clear introduction to what text and data mining is, and how it differs from other methods of information retrieval.

About EIFL

Working in collaboration with libraries in more than 60 developing and transition countries in Africa, Asia, Europe, and Latin America, EIFL enables access to knowledge for education, learning, research and sustainable community development. Visit eifl.net to learn more.

Connect to EIFL on:

Facebook - facebook.com/eIFL.net
Twitter - twitter.com/EIFLnet
LinkedIn - linkedin.com/groups/Friends-EIFL-1862455
Google+ - plus.google.com/+EiflNet/posts

Text and Data Mining: what librarians need to know
EIFL-Licensing/EIFL-IP webinar, 6 February 2014
www.bl.uk

Ben White and Ben O’Steen, British Library

How Much Data is There?
2013: 1.8 zettabytes? And 80% is unstructured.

Learning and Research
• For millennia learning has been based on people reading;
• Taking notes;
• Extracting facts and data; and
• Organising information.

Pre-mid-1990s = pen, pencil and eyes.

Computers can now read
© Woodguy

And a lot faster than humans

How to Do Research in 2013?
Post-mid-1990s = pen, pencil, eyes AND computers. There are off-the-shelf text and data mining tools from software providers, but researchers write their own programmes too.

What is Text and Data Mining? (NOT search by a search engine)
Algorithms are “intelligently” analysing and reading the text / data (using statistics, probabilities, computational linguistics etc.) to do, amongst other things:
i) Make assumptions about what text strings are about (e.g. is the “tree” a piece of wood, a family tree, or the tree of life (biology)?);
ii) Analyse what the entire text is about;
iii) See if there is a +ve or –ve relationship between two preselected variables.
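
As a rough illustration of point i), the sketch below guesses which sense of “tree” a sentence uses by counting how many nearby words match a small list of cue words for each sense. The cue lists and the function name `guess_sense` are made up for this example and do not come from the webinar; real tools rely on much larger dictionaries and statistical models.

```python
import re

# Hypothetical cue words for each sense of "tree" (an assumption for illustration).
SENSE_CUES = {
    "wood": {"wood", "timber", "forest", "oak", "leaves", "branch"},
    "family tree": {"family", "ancestors", "genealogy", "grandparents", "descendants"},
    "tree of life": {"species", "evolution", "biology", "organisms", "phylogenetic"},
}


def guess_sense(sentence, sense_cues=SENSE_CUES):
    """Return the sense whose cue words overlap most with the sentence, or None."""
    # Lower-case the sentence and split it into a set of plain words.
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    # Score each sense by the number of cue words found in the sentence.
    scores = {sense: len(words & cues) for sense, cues in sense_cues.items()}
    best = max(scores, key=scores.get)
    # With no cue word present, the sketch declines to guess.
    return best if scores[best] > 0 else None


print(guess_sense("She traced her ancestors on the family tree"))
print(guess_sense("Darwin drew a tree relating species through evolution"))
```

Counting overlapping words is about the simplest possible approach; it is shown only to make clear that the algorithm is weighing context rather than matching a search string.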

Text Mining Shakespeare

What is Text and Data Mining?
This allows people, for example, to:
i) See if there is some kind of relationship between a chemical / enzyme etc. and a medical disease;
ii) Discover some previously undiscovered use for a drug or a chemical compound;
iii) Allow organisations to organise electronic data by subject category etc.
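
To make the idea of looking for a relationship between two preselected terms more concrete, here is a minimal sketch that counts how often two terms appear in the same document and computes a simple "lift" score. The function name `cooccurrence`, the sample sentences and the choice of lift as the measure are all assumptions for illustration, not the method described in the webinar. A lift above 1 means the terms appear together more often than chance would suggest, and below 1 less often; it hints at where to look and proves nothing on its own.

```python
import re


def cooccurrence(documents, term_a, term_b):
    """Count documents mentioning term_a, term_b and both, plus a lift score."""
    n = len(documents)
    has_a = has_b = both = 0
    for text in documents:
        # Reduce each document to a set of lower-case words.
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        a, b = term_a in words, term_b in words
        has_a += a
        has_b += b
        both += a and b
    # lift = P(a and b) / (P(a) * P(b)); undefined if either term never appears.
    lift = (both * n) / (has_a * has_b) if has_a and has_b else None
    return {"a": has_a, "b": has_b, "both": both, "lift": lift}


# Made-up sample documents, for illustration only.
docs = [
    "Aspirin reduced fever in patients",
    "Aspirin and fever were recorded together",
    "Fever was common in the cohort",
    "The library holds many journals",
]
result = cooccurrence(docs, "aspirin", "fever")
print(result)
```

In practice the same counting would run across thousands of articles, which is why researchers want the full text held locally rather than read one page at a time.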

TDM & Libraries
Libraries are important as they provide access to scholarly information. There is a lot of text and data on the web, but there is also very valuable content in books and journals. People want to hold the data locally and work on it using their own tools.

Text and Data Mining – Big Business
Video time! (hopefully) http://www.youtube.com/watch?v=2YQNQ_GLe9Q

Savings in the Health Sector

New Medical Discoveries

Reduces Reading Times Exponentially

Not Just Computer Scientists Either
© South Wiltshire Girls School

The Right to Read is the Right to Mine?
• Facts and data are not subject to copyright and database rights.
• But computers have to copy in order to mine the data, so is it a licensable activity? (The EU has an “internet browser” exception as browsers cache …)
• European Commission stakeholder dialogue on TDM / “Licences for Europe”: the research / library sector, the technology sector and open access publishers boycotted it.

The Right to Read is the Right to Mine?
• How would you license the internet?
• UKPMC: 75 publishers had articles with the word “malaria” in the title. The BL estimates, from experience, that negotiating a new licence takes 16 months on average.
• TDM goes across thousands / tens of thousands of articles to which you ALREADY have legal access. How can you renegotiate this with all the publishers concerned?
• UK universities are experiencing server access being suspended automatically when abnormal access is detected.

Thank you (unless indicated otherwise)

Now it’s question time!

Further information
• Find out more about the EIFL-Licensing programme: www.eifl.net/licensing
• Find out more about the EIFL-IP programme: www.eifl.net/copyright

Stay connected
• Visit our website: www.EIFL.net
• Subscribe to our newsletter: www.EIFL.net/subscribe
• Join email lists for EIFL programmes
• facebook.com/EIFLnet
• twitter.com/EIFLnet
• www.flickr.com/photos/EIFL
