xreferplus-dereksturdy

Published on October 10, 2007

Author: guestfbf1e1

Source: slideshare.net

Non-legal content integration – issues, methods, benefits

Derek Sturdy, Tikit Granite & Comfrey

Our founders

Sir William Granite, 1738-1813

Rev. Dr. Nicholas Comfrey, 1742-1818

Their first employee

Miss Emma Hardfarthing, c. 1801

KM in perspective

internal: matter documents; know-how; marketing, project documents

external resources (including primary law, government, online commentary, non-legal content, "trade" sites, CDs, etc)

Outline

Who needs to link to non-legal content?

Linking via taxonomies

implications for internal taxonomies

taxonomy to taxonomy

taxonomy to full text

Linking by straight search

Who are the users?

Not primarily researchers because they know how to set about it anyway

Non-legal content linking is mainly for

lawyers at their desks

marketing people

services staff eg IT, secretaries

What do our users have in common?

They want a complex issue made simple, which is impossible

all that silly stuff about "integration" and "just give me a simple box" which results in 75,000 hits or nothing

They will gratefully accept handsomely presented guidance

What's wrong with Google?

Nothing at all, except that

your users do not know what is verified and what is rubbish

even the "advanced" search is just one of those oh-so-nineties field things

50 pages * 20 hits at legal costs = ruin

basically, far too much information because of all the junk on the web

What does this actually mean?

That all successful attempts to integrate valuable content are trying their own methods of getting round the structured – unstructured issue

Is there a one size fits all answer? No, there isn't. Let's look at that ...

The internal only answer

Relational databases (ie organised metadata)

handle precision and recall

handle the updating issues

handle lateral linking

but sadly ....

Outside your control is all the other external stuff which is still unstructured "content" – ie straight text – low value, but lots of it!

This is a temporary phase, but it will see most of us out .....

Ways to approach this

Autonomy – designed for science, brilliant at science, rubbish for law

Metadata – which means the taxonomy stuff in terms of added value – designed for soft subjects like law and social science

Hybrid systems – like xrefer – which use ingenious software to try and cut down the costs of the metadata approach

Why not purely automatic software?

Because of the tiny legal vocabulary – 5000 terms, instead of 250,000 – with meaning dependent on context

Because of the citation problem – not to be discussed in detail today

In essence: automatic software needs one word to have one meaning, which is true in science (normally) and often not true of law (except at the highest level)
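
The one-word-one-meaning point can be sketched in a few lines. This is a minimal illustration only: the terms, contexts and glosses below are assumptions chosen for the example, not taken from any real thesaurus.

```python
# Hypothetical mini-thesaurus: one legal term, several meanings,
# each tied to the context in which the term is used.
SENSES = {
    "consideration": {
        "contract": "something of value exchanged to make a promise binding",
        "general": "careful thought",
    },
    "execution": {
        "contract": "signing a document so that it takes effect",
        "litigation": "enforcing a court judgment",
    },
}


def naive_lookup(term):
    """What string matching alone sees: every sense of the word at once."""
    return sorted(SENSES.get(term, {}).values())


def lookup_in_context(term, context):
    """Return the single meaning for this context, or None if not covered."""
    return SENSES.get(term, {}).get(context)


all_senses = naive_lookup("execution")
one_sense = lookup_in_context("execution", "litigation")
```

Without the context, the software has two candidate meanings for "execution" and no way to choose; with it, there is exactly one.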

What must a taxonomy deliver?

Real help in finding things

Therefore - no ambiguity!

Comfort for users of collections

have I got everything relevant? - comprehensiveness

have I avoided irrelevance? - accuracy

can I easily find similar stuff? – lateral linking

Is it still true?

if the firm knows anything about anything on which practice is based, do I know it too?
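
The two comfort questions above, "have I got everything relevant?" and "have I avoided irrelevance?", are read here as recall and precision. That mapping is an assumption, and the document identifiers are made up; the sketch only shows how the two measures are computed for one search.

```python
def precision_recall(retrieved, relevant):
    """Precision: share of retrieved items that are relevant.
    Recall: share of relevant items that were retrieved."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical search: four documents come back, three are actually relevant.
retrieved = ["doc1", "doc2", "doc3", "doc4"]
relevant = ["doc2", "doc3", "doc5"]
precision, recall = precision_recall(retrieved, relevant)
```

Here half of what came back was useful (precision 0.5), and one relevant document in three was missed (recall 2/3).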

Components: taxonomies

Thesauri

legal subject, legal work type

geog./jurisdiction, industry/sector, assets

Authority files built up for

cases

legislation

own know-how documents

grey paper

The Three C’s

Classification – subject matter

Categorisation – types of work

Citation – reference to other documents, but especially to legal authorities

Cases

Legislation

Where might these be applied?

External Resources: general www resources; paid-for online resources; primary law resources

The Firm: document management; practice management; know-how management

example: Search Engine Applications (diagram of the same six resources)

Classification - subject thesauri (diagram of the same six resources)

Categorisation – type of work (diagram of the same six resources)

Authority Files – exact Citations (diagram of the same six resources)

Metadata density and classification workloads

Matter documents (matter metadata): relatively simple, high volume; millions of documents

Know-how (KH metadata): specialist, complex, low volume

External Sources; Specialist know-how

Conclusions so far

Only certain materials within the firm – know-how - will have detailed classification, categorisation and citation work done on them

Most other materials in the firm will be classified at a high level only, or be classified by inheritance (eg documents within a matter file)
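
Classification by inheritance can be sketched as follows. The class and field names are hypothetical, not taken from any particular document or practice management system: a document with no classification of its own falls back to that of its matter file, while a know-how document carries its own detailed terms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Matter:
    matter_id: str
    subject: str      # high-level classification, set once per matter
    work_type: str


@dataclass
class Document:
    title: str
    matter: Optional[Matter] = None
    subject: Optional[str] = None    # filled in only for know-how documents

    def effective_subject(self) -> Optional[str]:
        """Own classification if present, otherwise inherited from the matter."""
        if self.subject is not None:
            return self.subject
        return self.matter.subject if self.matter else None


matter = Matter("M-001", subject="Employment", work_type="Litigation")
letter = Document("Letter to client", matter=matter)
precedent = Document("Standard settlement clause", subject="Employment: dismissal")
```

The letter costs nothing to classify, because it inherits "Employment" from its matter; only the precedent needed a person to choose a detailed term.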

Direct taxonomy-taxonomy linking

For the users - seriously cool and dead easy

For IS staff - match terms not by their letters and spaces but by a one-off human reconciliation of meanings and context – ie some work

Illustration: PLC
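
A minimal sketch of the reconciliation idea, assuming two small made-up term lists (they do not represent any real provider's taxonomy). Matching by letters and spaces misses terms that mean the same thing and accepts terms that merely look alike; the one-off, human-built table records the match by meaning.

```python
# Hypothetical internal and external term lists.
INTERNAL_TERMS = {"Dismissal", "Redundancy", "Title"}
EXTERNAL_TERMS = {"Termination of employment", "Redundancy", "Title"}

# One-off human reconciliation: internal term -> external term, or None
# where a person has decided the two taxonomies do not mean the same thing.
RECONCILED = {
    "Dismissal": "Termination of employment",  # different words, same meaning
    "Redundancy": "Redundancy",
    "Title": None,  # assumed: ours means title to land, theirs means job title
}


def match_by_string(term):
    """Letters-and-spaces matching: a link exists only if the strings agree."""
    return term if term in EXTERNAL_TERMS else None


def match_by_meaning(term):
    """Look the term up in the human-reconciled table."""
    return RECONCILED.get(term)
```

String matching finds nothing for "Dismissal" and wrongly links the two unrelated "Title" terms; the reconciled table gets both right, at the cost of some one-off work.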

Hybrid methodologies

Use the taxonomy to guide the inexperienced user's thoughts to the topic concerned

Drill-down and drill-up techniques are both useful

drill down: start with the general, go to the particular

drill up: choose a particular term, see if it exists, see the context and alternative terms
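
Drill-down and drill-up can both be run off the same small structure. The sketch below assumes a made-up five-term taxonomy stored as child-to-parent links; the function names are illustrative only.

```python
# Hypothetical taxonomy fragment: each term points to its parent (None = top).
PARENT = {
    "Employment": None,
    "Dismissal": "Employment",
    "Unfair dismissal": "Dismissal",
    "Wrongful dismissal": "Dismissal",
    "Redundancy": "Employment",
}


def drill_down(term):
    """Start with the general: list the narrower terms directly below it."""
    return sorted(child for child, parent in PARENT.items() if parent == term)


def drill_up(term):
    """Start with the particular: check the term exists, then show its
    context (path from the top) and the alternative terms beside it."""
    if term not in PARENT:
        return None
    path, node = [], term
    while node is not None:
        path.append(node)
        node = PARENT[node]
    path.reverse()
    parent = PARENT[term]
    alternatives = sorted(
        other for other, p in PARENT.items() if p == parent and other != term
    )
    return {"path": path, "alternatives": alternatives}
```

A user who types "Unfair dismissal" sees that it sits under Employment, then Dismissal, with "Wrongful dismissal" offered as the neighbouring term; a user who types a term that is not in the taxonomy is told so at once.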

Transfer the idea

You then use this approach, developed for your own internal resources – intranets, knowledge systems, DMS – to link out to external resources

Illustration using xrefer here ....

Implications for your taxonomies

Ambiguity remains the big enemy!

Other enemies:

"gosh aren't I clever" terms

jobs for the boys/girls – which usually result in loss of jobs for the ......

Pointless complexity is the source of most ambiguity – simplify!

segmented taxonomies are the neat way to simplify

Ambiguity - continued

If your taxonomies are not simple enough to avoid ambiguity, you should not be meddling with the idea at all

Complex taxonomies are for academics with time and a clear need for lateral thinking to the nth degree

In legal and governmental practice, your users (as defined) may have the brain, but not the time, or may not have the brain

Ambiguity - continued

Most ambiguity comes from a failure to grasp the point behind the metadata approach, which is "make it easy to find"

Classification and categorisation are simply tools, not ends in themselves

Ambiguity is what search engines do!

Direct search

The user sees a word or phrase ...

and does not understand it

and wants to know more about it

In the ideal world, she highlights it, clicks it, and gets seven organised results

In the real world, this does not happen ... but

Reference linking, concept blow-ups

The jury is out on this at present

If you do not know your topic, then you can be misled very easily

If you have a smattering of knowledge, you can probably navigate successfully

A little knowledge is much less dangerous than none, despite the proverb to the opposite effect!

Where does this leave us?

Correct choice of provider

This remains the only way at present to handle the "integration" and "unstructured data" problem

pure software doesn't do it – for us

"integrators" are fine for bulletins, but often useless for research and briefings

therefore the human element has to be introduced at some point or another

Where is the point of human input?

the metadata approach: after content has been published, the key content is indexed and abstracted

the source selection approach: from certain sources, the content is of sufficient quality that it does not need weeding – it's already abstracted, in other words

two sides of the same coin?

Conclusions

If you develop a single, large, unsegmented taxonomy, you will be stuck with search-engine approaches to external non-legal content

If you think beyond legal, to office (ie admin), industries / sectors (ie marketing), and so on, you can develop hybrid approaches

These will be more powerful than just search engines, though you need those too for esoterica

The key to this remains: choose your external content providers – give them the problem!

How it's done

classification by subject and work type

identification of legal references

authority files: subject, work type, cases, legislation

"link tables": piece of text to identified authority

Doc 1, Doc 2, Doc 3, Doc 4, Doc 5
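
A link table of the kind named on this slide might be built roughly as follows. This is a sketch under stated assumptions, not the method actually used: the authority identifiers, the two sample entries and the plain string matching are all illustrative, and real citation recognition is considerably harder.

```python
import re

# Hypothetical authority file: one identifier per authority, with the
# textual forms under which it is expected to be cited.
AUTHORITY_FILE = {
    "case:donoghue-v-stevenson": ["Donoghue v Stevenson", "Donoghue v. Stevenson"],
    "leg:data-protection-act-1998": ["Data Protection Act 1998", "DPA 1998"],
}


def build_link_table(doc_id, text):
    """Return rows of (doc_id, offset, matched text, authority id),
    one row per piece of text that matches a known form of an authority."""
    rows = []
    for authority_id, forms in AUTHORITY_FILE.items():
        for form in forms:
            for match in re.finditer(re.escape(form), text):
                rows.append((doc_id, match.start(), match.group(0), authority_id))
    return sorted(rows)


rows = build_link_table("Doc 1", "See Donoghue v. Stevenson; compare the DPA 1998.")
```

Each row ties one piece of text in one document to one identified authority, so every document citing the same case or statute can be found through the authority's identifier however the citation was typed.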
