Monday, April 30, 2012

Web Searching 2.0 & Beyond presentation

Presentation given at IBM in 2009

Presentation file
Web Searching 2.0.ppt

Presentation content

What is Web Searching 2.0?
Web Searching (WS) 2.0 is about searching Web 2.0 effectively
 How?
By changing our searching mindset
By utilizing modern web trends and tools in our daily searching

How does a search engine (SE) work?

Search Engine Optimization - SEO
What is SEO?
      SEO is the process of improving the volume and quality of traffic to a web site from search engines via "natural" ("organic" or "algorithmic") search results.
Good SEO = High Ranking
      The earlier a site is presented in the search results, the higher it "ranks" and the more searchers will visit that site.

How could website X be optimized for SEs?
Adding a relevant page title and meta description
Good navigation and layout
Other websites should refer (link) to website X
Sitemaps (an XML file describing your website)
Adding more languages
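
To make the sitemap item concrete, here is a minimal sketch that generates a sitemap with Python's standard library. It assumes the sitemaps.org 0.9 format; the example.com URLs and the build_sitemap helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return a sitemap document (as a string) listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical pages of "website X".
pages = ["https://example.com/", "https://example.com/about"]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

A real sitemap would usually be saved as sitemap.xml at the site root and could also carry optional fields such as a last-modified date per URL.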

Invisible Web - Deep Web

Deep Web
    The deep Web (also called Deep net, the invisible Web, or the hidden Web) refers to World Wide Web content that is not part of the surface Web, i.e. the part of the Web that is indexed by search engines.
    It is estimated that the deep Web is 500 times larger than the surface Web.

Searching – Where to search ?
Your PC
Google Desktop
Invisible Web
File sharing websites
Specialized Websites
Search Engines
Meta Search Engine
Specialized (Vertical) search engines
Social Bookmarks (Tagging)
Visual search
Your Own search engine
Yahoo! Search BOSS
Google Custom Search Engine

You Decide …

How to search - Basics
List your search targets
 Search engine
 Specialized website
List your search terms
Generate search strings from search terms
Fine-tune your search strings
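
The step from search terms to search strings can be sketched in a few lines of Python. This is only one possible approach, under the assumption that every combination of two or more terms is worth trying; the generate_search_strings helper and the sample terms are invented for illustration.

```python
from itertools import combinations

def generate_search_strings(terms, min_terms=2):
    """Combine search terms into candidate search strings, quoting multi-word phrases."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    strings = []
    for size in range(min_terms, len(quoted) + 1):
        for combo in combinations(quoted, size):
            strings.append(" ".join(combo))
    return strings

# Hypothetical search terms listed during planning.
terms = ["semantic web", "tutorial", "RDF"]
search_strings = generate_search_strings(terms)
for s in search_strings:
    print(s)
```

Fine-tuning then means dropping the combinations that return noise and tightening the ones that look promising.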

How to search - Advanced


Google - Specialties

Search Tips/Tricks
Change language when appropriate
Always use advanced operators
Think about the best format for your results (e.g. filetype:ppt)
Write your search term as a question when appropriate
Don’t rush to search engine forms; plan first
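
As an illustration of planning a query before typing it, the sketch below assembles a search URL that uses the filetype: and site: operators. The build_query_url helper and the sample values are assumptions made for this example; only the operators and the q parameter of the search URL are taken as given.

```python
from urllib.parse import urlencode

def build_query_url(terms, filetype=None, site=None):
    """Build a search URL from plain terms plus optional advanced operators."""
    parts = list(terms)
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict results to one file type
    if site:
        parts.append(f"site:{site}")          # restrict results to one domain
    query = " ".join(parts)
    return "https://www.google.com/search?" + urlencode({"q": query})

# Hypothetical plan: find presentations about web searching on ibm.com.
url = build_query_url(["web", "searching"], filetype="ppt", site="ibm.com")
print(url)
```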

Web Searching 20XX
Web 3.0
    Nova Spivack defines Web 3.0 as the third decade of the Web (2010–2020), during which he suggests several major complementary technology trends will reach new levels of maturity simultaneously.

Web 3.0  - Semantic Web - Concepts

Semantic Web - Technical
Semantic Web

RDF – Store data as “triples”

OWL – Define systems of concepts called “ontologies”

SPARQL – Query data in RDF

SWRL – Define rules

GRDDL – Transform data to RDF

RDF Triples
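
An RDF triple is a statement of the form (subject, predicate, object). The sketch below models triples as plain Python tuples and matches them against a pattern, which is roughly what a basic SPARQL query does. It is a toy illustration, not an RDF library; the ex: identifiers, the sample facts, and the match helper are all made up.

```python
# Each triple is a (subject, predicate, object) statement. All data is illustrative.
triples = [
    ("ex:Alice", "ex:worksFor", "ex:IBM"),
    ("ex:Bob", "ex:worksFor", "ex:IBM"),
    ("ex:IBM", "ex:locatedAt", "ex:Armonk"),
]

def match(pattern, data):
    """Return triples matching the pattern; None acts as a wildcard, like a SPARQL variable."""
    return [
        t for t in data
        if all(p is None or p == v for p, v in zip(pattern, t))
    ]

# "Who works for IBM?" expressed as a pattern with an unknown subject.
employees = [s for s, _, _ in match((None, "ex:worksFor", "ex:IBM"), triples)]
print(employees)
```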

The Web is the DB


IBM & Web Searching
IBM OmniFind Personal E-mail Search

    A powerful semantic search engine that enables you to search your e-mail easily and effectively; plug-ins are available for Microsoft Outlook and Lotus Notes mail systems.

OmniFind Enterprise Edition

    Pre-built integrations to more than 25 enterprise sources, including Lotus® Domino®, Windows SharePoint Services, FileNet repositories, Documentum repositories, shared file systems, relational databases, Microsoft® Exchange, and many others.

    Open platform for processing unstructured information to enable semantic queries, navigation of business intelligence results, and custom analytics applications.

Unstructured Information Management Architecture

    Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. UIMA is a framework and SDK for developing such applications. An example UIM application might ingest plain text and identify entities, such as persons, places, organizations; or relations, such as works-for or located-at.
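
To give a feel for what "ingest plain text and identify entities" means, here is a toy Python sketch that annotates text using a small lookup table. It is not the UIMA API and makes no claim about how UIMA annotators are written; the GAZETTEER table, the annotate function, and the sample sentence are hypothetical.

```python
import re

# Hypothetical gazetteer: surface form -> entity type. Real systems use far richer models.
GAZETTEER = {
    "Alice Smith": "Person",
    "IBM": "Organization",
    "New York": "Place",
}

def annotate(text):
    """Return (start, end, type, covered_text) annotations, sorted by position."""
    annotations = []
    for surface, entity_type in GAZETTEER.items():
        for m in re.finditer(re.escape(surface), text):
            annotations.append((m.start(), m.end(), entity_type, m.group()))
    return sorted(annotations)

text = "Alice Smith works for IBM in New York."
annotations = annotate(text)
for a in annotations:
    print(a)
```

A relation such as works-for could then be derived from pairs of annotations, for example a Person followed by an Organization in the same sentence.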

IBM Web Ontology Manager

    IBM Web Ontology Manager is a lightweight, Web-based tool for managing ontologies expressed in Web Ontology Language (OWL). With this technology, users can browse, search, and submit ontologies to an ontology repository. Developers can discover new ontologies without having to develop them themselves; this promotes reuse and reduces development time and effort. This technology includes a Web interface for easy uploading of ontologies in an .owl format by any user of the system. It also includes an interface for generating (using Jastor) Java™ APIs from uploaded ontology files.

Semantic Layer Research Platform

    A platform for building Semantic applications that use RDF, LSID and other Semantic Web technologies. The platform includes several components, such as an RDF server with collections, ACLs, replication and transactions, and client and web development kits, including an Eclipse suite of plugins for RDF consumption.

Boca is the foundation of many of the platform's components. It is an enterprise-featured RDF store that provides support for multiple users, distributed clients, offline work, real-time notification, named-graph modularization, versioning, access controls, and transactions with preconditions. Boca includes two subsystems that may also be interesting on their own:
 Glitter is a SPARQL engine independent of any particular backend. It allows interfaces to backend data sources to plug in to the core engine and generate solutions for portions of SPARQL queries with varying granularity. The core engine orchestrates query rewriting, optimization, and execution, and composes solutions generated by the backend. A Boca-specific backend allows SPARQL queries to be compiled to Boca’s temporal database schema.
 Sleuth provides full-text search capabilities for text literals within Boca. Text literals are indexed with Apache Lucene, and the index also stores information about the named graph, subject, and predicate to which the literal is attached.
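
The idea behind Sleuth, a text index that remembers which named graph, subject, and predicate each literal belongs to, can be sketched with a small inverted index. This is plain Python, not Lucene or Boca code; the build_index helper and the sample data are invented for illustration.

```python
from collections import defaultdict

def build_index(quads):
    """Map each lowercase token of a literal to the (graph, subject, predicate) it came from."""
    index = defaultdict(set)
    for graph, subject, predicate, literal in quads:
        for token in literal.lower().split():
            index[token].add((graph, subject, predicate))
    return index

# Hypothetical text literals with their named graph, subject, and predicate.
quads = [
    ("g:docs", "ex:report1", "ex:title", "Semantic web search"),
    ("g:docs", "ex:report2", "ex:title", "Enterprise search basics"),
    ("g:mail", "ex:msg7", "ex:subject", "Semantic layer meeting"),
]
index = build_index(quads)
hits = sorted(index["semantic"])
print(hits)
```

Looking up a word returns the places in the RDF store where it occurs, so a full-text hit can be followed back to the statement that contains it.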

IBM Integrated Ontology Development Toolkit
    Ontology: an ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts.

    IODT is a toolkit for ontology-driven development. This toolkit includes EMF Ontology Definition Metamodel (EODM) and an OWL ontology repository named Scalable Ontology Repository (SOR).


IBM Multimedia Analysis and Retrieval System
IBM LanguageWare Miner for Multidimensional Socio-Semantic Networks
Anatomy Lens
Agent Building and Learning Environment
System Text for Information Extraction
Scalable Highly Expressive Reasoner


  1. WS 2.0 is about searching Web 2.0 effectively
  2. Most of the web (the invisible web) does not appear in search engine results
  3. Web searching is not limited to search engines
  4. Web 3.0 = Web 2.0 + Semantic Web
  5. IBM and WS

Thank You



