Monday, April 30, 2012

Web Searching 2.0 & Beyond presentation

Presentation given at IBM in 2009

Presentation file
Web Searching 2.0.ppt

Presentation content


What is Web Searching 2.0?
WS 2.0 is about searching Web 2.0 effectively
How?
By changing our searching mindset
By using modern web trends and tools in our daily searching



How does a search engine (SE) work?
Crawling
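
As a rough illustration of the crawling step, here is a minimal sketch of a breadth-first crawl. It runs over a hypothetical in-memory "web" (the TOY_WEB dictionary and the example.com-style URLs are assumptions made for this example) rather than fetching real pages, and it is not how any particular search engine is implemented.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
TOY_WEB = {
    "http://a.example/": ["http://b.example/", "http://c.example/"],
    "http://b.example/": ["http://c.example/", "http://a.example/"],
    "http://c.example/": ["http://d.example/"],
    "http://d.example/": [],
}


def crawl(seed, fetch_links, max_pages=100):
    """Breadth-first crawl: visit the seed, then the pages it links to, and so on."""
    seen = {seed}              # URLs already queued, so no page is visited twice
    frontier = deque([seed])   # URLs waiting to be visited
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# fetch_links is passed in so a real HTTP fetcher could replace the toy lookup.
order = crawl("http://a.example/", lambda url: TOY_WEB.get(url, []))
print(order)
```

A real crawler would also respect robots.txt, throttle its requests, and hand each fetched page to an indexer.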

Search Engine Optimization - SEO
What is SEO?
      SEO is the process of improving the volume and quality of traffic to a web site from search engines via "natural" ("organic" or "algorithmic") search results.
Good SEO = High Ranking
      The earlier a site appears in the search results (the higher it "ranks"), the more searchers will visit that site.

How can website X be optimized for SEs?
Add a relevant page title and meta description
Use good navigation and layout
Get other websites to refer (link) to website X
Provide sitemaps (an XML file describing your website)
Add more languages
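
To make the sitemap item concrete, here is a minimal sketch that generates a sitemap file body with Python's standard library. The namespace is the one defined by the sitemaps.org protocol; the function name build_sitemap and the example.com URLs are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML listing each URL in a <url><loc>...</loc></url> entry."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = loc
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages of "website X".
sitemap_xml = build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/about",
])
print(sitemap_xml)
```

The result is typically saved as sitemap.xml at the site root so that search engines can discover pages that are hard to reach through navigation alone.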

Invisible Web - Deep Web

Deep Web
    The deep Web (also called Deep net, the invisible Web, or the hidden Web) refers to World Wide Web content that is not part of the surface Web, which is indexed by search engines.
    One widely cited estimate puts the deep Web at about 500 times the size of the surface Web.

Searching – Where to search?
Your PC
Google Desktop
Invisible Web
File sharing websites
Specialized Websites
Wikipedia.org
SlideShare.com
Directories
DMOZ.org
Search Engines
Meta Search Engine
Search.com
Metacrawler.com
Dogpile.com
Specialized (Vertical) search engines
Koders.com
Wink.com
Social Bookmarks (Tagging)
Delicious.com
Digg.com
Visual search
SearchMe.com
Kartoo.com
Your Own search engine
Yahoo! Search BOSS
    http://developer.yahoo.com/search/boss/
Google Custom Search Engine
    http://www.google.com/coop/cse/
Others
Clusty.com


Searching – Where to search?

You Decide …
How to search - Basics
List your search targets
 Search engine
 Specialized website
 …
List your search terms
Generate search strings from search terms
Search
Fine-tune your search strings
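
The step "generate search strings from search terms" can be sketched as follows. This is one possible approach, not a prescribed method: it quotes multi-word terms as exact phrases and lists term combinations from the most specific to the least. The function name search_strings and the sample terms are assumptions made for the example.

```python
from itertools import combinations


def search_strings(terms, min_terms=2):
    """Yield candidate search strings, most specific (all terms) first."""
    # Multi-word terms are quoted so the engine treats them as exact phrases.
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    for size in range(len(quoted), min_terms - 1, -1):
        for combo in combinations(quoted, size):
            yield " ".join(combo)


queries = list(search_strings(["semantic web", "RDF", "tutorial"]))
for q in queries:
    print(q)
```

Starting with the most specific string and relaxing it when the results are too few is one simple way to fine-tune.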

How to search - Advanced

Google - Specialties

Search Tips/Tricks
Change language when appropriate
Always use advanced operators
Think about the best format for your results (for example, filetype:ppt)
Write your search term as a question when appropriate
Don’t rush to search engine forms; plan first
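
As a small illustration of planning a query with advanced operators before typing it into a search form, the sketch below assembles a query string and URL-encodes it. The filetype: and site: operators are the ones Google documents; the helper name build_query and the sample values are assumptions made for this example.

```python
from urllib.parse import urlencode


def build_query(terms, filetype=None, site=None, exact_phrase=None):
    """Combine plain terms with optional advanced operators into one query."""
    parts = list(terms)
    if exact_phrase:
        parts.append(f'"{exact_phrase}"')
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict results to one file format
    if site:
        parts.append(f"site:{site}")          # restrict results to one domain
    return " ".join(parts)


query = build_query(["semantic", "web"], filetype="ppt", site="ibm.com")
url = "https://www.google.com/search?" + urlencode({"q": query})
print(query)
print(url)
```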


Web Searching 20XX
Web 3.0
    Nova Spivack defines Web 3.0 as the third decade of the Web (2010–2020), during which, he suggests, several major complementary technology trends will reach new levels of maturity simultaneously.



Web 3.0  - Semantic Web - Concepts




Semantic Web - Technical
Semantic Web
Standards

RDF – Store data as “triples”


OWL – Define systems of concepts called “ontologies”

SPARQL – Query data in RDF

SWRL – Define rules

GRDDL – Transform data to RDF




Semantic Web - Technical
RDF Triples
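
To show what "storing data as triples" means, here is a minimal sketch in plain Python. Each fact is a (subject, predicate, object) tuple, and a small pattern matcher plays the role that a SPARQL query plays over real RDF data. The ex: names and the match helper are hypothetical; a real application would use an RDF library and a SPARQL engine.

```python
# Each fact is a (subject, predicate, object) triple. The ex: names are made up.
TRIPLES = [
    ("ex:Alice", "ex:worksFor", "ex:Acme"),
    ("ex:Bob", "ex:worksFor", "ex:Acme"),
    ("ex:Acme", "ex:locatedIn", "ex:Springfield"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# "Who works for Acme?" expressed as the pattern (?, ex:worksFor, ex:Acme).
employees = [s for s, _, _ in match(TRIPLES, p="ex:worksFor", o="ex:Acme")]
print(employees)
```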


Semantic Web - Technical
Semantic Web
The Web is the DB





Implementations
Hakia.com
cluuz.com
Twine.com
PowerSet.com

IBM & Web Searching
IBM OmniFind Personal E-mail Search
    http://www.alphaworks.ibm.com/tech/emailsearch


    A powerful semantic search engine that enables you to search your e-mail easily and effectively; plug-ins are available for Microsoft Outlook and Lotus Notes mail systems.

OmniFind Enterprise Edition
    http://www-01.ibm.com/software/data/enterprise-search/omnifind-enterprise/features.html?S_CMP=wspace


    Pre-built integrations to more than 25 enterprise sources, including Lotus® Domino®, Windows SharePoint Services, FileNet repositories, Documentum repositories, shared file systems, relational databases, Microsoft® Exchange, and many others.

    Open platform for processing unstructured information to enable semantic queries, navigation of business intelligence results, and custom analytics applications.



IBM & Web Searching
Unstructured Information Management Architecture
    http://www.alphaworks.ibm.com/tech/uima

    Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. UIMA is a framework and SDK for developing such applications. An example UIM application might ingest plain text and identify entities, such as persons, places, organizations; or relations, such as works-for or located-at.
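
The kind of analysis described above can be illustrated with a toy annotator. This sketch does not use the UIMA SDK or its API; it relies on a single hypothetical regular expression to find "X works for Y" statements in plain text and emit entity and relation annotations.

```python
import re

# Hypothetical pattern: "<Person> works for <Organization>", both capitalized words.
WORKS_FOR = re.compile(r"\b([A-Z][a-z]+) works for ([A-Z][A-Za-z]+)\b")


def annotate(text):
    """Return entity and relation annotations found in plain text."""
    annotations = []
    for m in WORKS_FOR.finditer(text):
        person, org = m.group(1), m.group(2)
        annotations.append({"type": "Person", "text": person, "span": m.span(1)})
        annotations.append({"type": "Organization", "text": org, "span": m.span(2)})
        annotations.append({"type": "works-for", "args": (person, org)})
    return annotations


result = annotate("Alice works for Acme. Bob works for Globex.")
for a in result:
    print(a)
```

Real annotators rely on dictionaries, grammars, or statistical models rather than one pattern, but the output has the same shape: typed spans over the text plus relations between them.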


IBM Web Ontology Manager
    http://www.alphaworks.ibm.com/tech/wom


    IBM Web Ontology Manager is a lightweight, Web-based tool for managing ontologies expressed in Web Ontology Language (OWL). With this technology, users can browse, search, and submit ontologies to an ontology repository. Developers can discover new ontologies without having to develop the ontology themselves; reusability is thereby promoted and development time and effort are reduced. This technology includes a Web interface for easy uploading of ontologies in an .owl format by any user of the system. It also includes an interface for generating (using Jastor) Java™ APIs from uploaded ontology files.


IBM & Web Searching

Semantic Layer Research Platform
    https://w3.opensource.ibm.com/projects/slrp/
    http://ibm-slrp.sourceforge.net/2006/12/01/what-is-the-ibm-semantic-layered-research-platform-contd/


Boca.
Boca is the foundation of many of our components. It is an enterprise-featured RDF store that provides support for multiple users, distributed clients, offline work, real-time notification, named-graph modularization, versioning, access controls, and transactions with preconditions. Along with Boca are included two subsystems which may also be interesting on their own:
Glitter.
 Glitter is a SPARQL engine independent of any particular backend. It allows interfaces to backend data sources to plug in to the core engine and generate solutions for portions of SPARQL queries with varying granularity. The core engine orchestrates query rewriting, optimization, and execution, and composes solutions generated by the backend. A Boca-specific backend allows SPARQL queries to be compiled to Boca’s temporal database schema.
Sleuth.
 Sleuth provides full-text search capabilities for text literals within Boca. Text literals are indexed with Apache Lucene, and the index also stores information about the named graph, subject, and predicate to which the literal is attached.
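
The idea behind Sleuth can be sketched with a toy inverted index. This is not Sleuth's or Lucene's actual API. It only shows the data shape the description implies: each token of a text literal maps back to the named graph, subject, and predicate the literal is attached to. The sample data and the build_index helper are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical quads: (named graph, subject, predicate, text literal).
QUADS = [
    ("graph:docs", "ex:doc1", "ex:title", "Semantic Web search"),
    ("graph:docs", "ex:doc2", "ex:title", "Enterprise search basics"),
    ("graph:notes", "ex:note1", "ex:body", "RDF stores and the semantic layer"),
]


def build_index(quads):
    """Map each lowercased token to the (graph, subject, predicate) it occurs under."""
    index = defaultdict(set)
    for graph, subject, predicate, literal in quads:
        for token in literal.lower().split():
            index[token].add((graph, subject, predicate))
    return index


index = build_index(QUADS)
hits = sorted(index["semantic"])
print(hits)
```

Because each hit carries its graph, subject, and predicate, a full-text match can be joined back into a structured query over the store.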
   
    A platform for building Semantic applications that use RDF, LSID and other Semantic Web technologies. The platform includes several components such as an RDF server with collections, ACLs, replication and transactions, client and web development kits including an Eclipse suite of plugins for RDF consumption.



IBM & Web Searching

IBM Integrated Ontology Development Toolkit
    http://www.alphaworks.ibm.com/tech/semanticstk
   
    Ontology: an ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts.

    IODT is a toolkit for ontology-driven development. This toolkit includes EMF Ontology Definition Metamodel (EODM) and an OWL ontology repository named Scalable Ontology Repository (SOR).

Others

IBM Multimedia Analysis and Retrieval System
IBM LanguageWare Miner for Multidimensional Socio-Semantic Networks
Anatomy Lens
Agent Building and Learning Environment
System Text for Information Extraction
Scalable Highly Expressive Reasoner
MUFIN
http://mufin.fi.muni.cz/imgsearch/

Projects
http://www.guardian.co.uk/media/2007/mar/05/bbc.newmedia

Summary
  1. WS 2.0 is about searching Web 2.0 effectively
  2. Most of the web (the invisible web) does not appear in search engine results
  3. Web searching is not limited to search engines
  4. Web 3.0 = Web 2.0 + Semantic Web
  5. IBM and WS


Thank You

References

Wikipedia
http://en.wikipedia.org/wiki/Crawling
http://en.wikipedia.org/wiki/Web_3.0
http://en.wikipedia.org/wiki/Semantic_Web
http://en.wikipedia.org/wiki/Search_engine_optimization
http://en.wikipedia.org/wiki/Deep_web
http://en.wikipedia.org/wiki/List_of_search_engines
http://en.wikipedia.org/wiki/Social_bookmarking
http://en.wikipedia.org/wiki/Metasearch_engine

Search Engines
http://www.google.com/support/?ctx=web
http://help.yahoo.com/l/us/yahoo/search/basics/basics-04.html
http://www.googleguide.com/


IBM
http://www.alphaworks.ibm.com/

http://w3.ibm.com/bluepedia/display/en/Semantic+Web
http://omnifind.ibm.yahoo.net/


Others
http://www.lib.berkeley.edu/TeachingLib/Guides/Internet/InvisibleWeb.html
http://searchengineland.com/comscore-yahoo-microsoft-gain-share-but-google-breaks-7-billion-searches-14412.php
http://www.hybridsem.com/blog/2007/11/07/the-ultimate-guide-to-advanced-searching-within-yahoo-google-and-msn-operators/
http://www.ihelpyou.com/search-engine-chart.html
http://www.bruceclay.com/searchenginechart.pdf
http://www.bruceclay.com/serc_histogram/histogram.htm
http://www.seomoz.org/blog/the-search-engines-semantic-analysis-capabilities
http://www.usa.gov/webcontent/documents/SEO_Course%2007.ppt
http://www.brightplanet.com/resources/details/deepweb.html
http://www.windweaver.com/searchlinks.htm
http://vimeo.com/1062481
http://www.dlib.indiana.edu/education/brownbags/fall2008/Semantic_Web/SemanticWeb.ppt
http://pixelcort.com/259
http://computer.howstuffworks.com/web-30.htm
http://www.readwriteweb.com/archives/10_semantic_apps_to_watch_one_year_later.php
http://www.semwebcentral.org/?open
http://www.fastcompany.tv/video/ibm-uses-semantics-get-better-search

