Presentation file
Web Searching 2.0.ppt
Presentation content
What is Web Searching 2.0?
WS 2.0 is about searching Web 2.0 effectively
How?
By changing our searching mindset
By using modern web trends and tools in our daily searching
How does a search engine (SE) work?
Crawling
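As a rough illustration of crawling, the sketch below walks a tiny in-memory "web" breadth-first, extracting links from each page and queueing the ones it has not seen yet. The PAGES dictionary and its example.org URLs are made up for the example; a real crawler would fetch pages over HTTP, respect robots.txt, and pass the text to an indexer.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML. A real crawler would fetch over HTTP.
PAGES = {
    "http://example.org/": '<a href="http://example.org/a">A</a> '
                           '<a href="http://example.org/b">B</a>',
    "http://example.org/a": '<a href="http://example.org/b">B</a>',
    "http://example.org/b": '<a href="http://example.org/">home</a> '
                            '<a href="http://example.org/missing">?</a>',
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, pages):
    """Breadth-first crawl from seed; returns URLs in the order visited."""
    visited = []
    seen = {seed}
    queue = deque([seed])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:  # page does not exist in our toy web: skip it
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:  # never queue the same URL twice
                seen.add(link)
                queue.append(link)
    return visited


visited_order = crawl("http://example.org/", PAGES)
```

The seen set is what keeps the crawler from looping forever on pages that link back to each other.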
Search Engine Optimization - SEO
What is SEO?
SEO is the process of improving the volume and quality of traffic to a web site from search engines via "natural" ("organic" or "algorithmic") search results.
Good SEO = High Ranking
The earlier a site appears in the search results, the higher it "ranks" and the more searchers will visit it.
How could website X be optimized for SEs?
Adding relevant page Title and Meta Description
Good Navigation and Layout
Other websites should refer (link) to website X
Sitemaps (an XML file describing your website)
Adding more Languages
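To make the sitemap item concrete, here is a minimal sketch that builds a sitemap with Python's standard library. The namespace is the one used by the sitemaps.org protocol; the page URLs stand in for "website X" and are made up. Real sitemaps often add optional fields such as lastmod, which are left out here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return sitemap XML (as a string) listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages of "website X".
sitemap_xml = build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/about",
])
```

The resulting string would typically be saved as sitemap.xml at the site root so that search engines can find it.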
Invisible Web - Deep Web
Deep Web
The deep Web (also called Deep net, the invisible Web, or the hidden Web) refers to World Wide Web content that is not part of the surface Web, which is indexed by search engines.
It is estimated that the deep web is 500 times larger than the surface Web
Searching – Where to search?
Your PC
Google Desktop
Invisible Web
File sharing websites
Specialized Websites
Wikipedia.org
SlideShare.com
Directories
DMOZ.org
Search Engines => covered after 2 slides
Meta Search Engine
Search.com
Metacrawler.com
Dogpile.com
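A meta search engine sends one query to several engines and merges what comes back. The sketch below shows one possible merging step, reciprocal rank scoring, over two hard-coded result lists. This is an assumption chosen for illustration, not a description of how Search.com, Metacrawler, or Dogpile actually rank; the engine names, URLs, and the constant k are all hypothetical.

```python
from collections import defaultdict


def merge_results(result_lists, k=60):
    """Fuse several ranked URL lists using reciprocal rank scoring.

    A URL earns 1 / (k + rank) from every list it appears in, so URLs
    ranked highly by several engines float to the top.
    """
    scores = defaultdict(float)
    for results in result_lists:
        for rank, url in enumerate(results, start=1):
            scores[url] += 1.0 / (k + rank)
    # Highest combined score first; ties broken alphabetically.
    return sorted(scores, key=lambda url: (-scores[url], url))


# Hypothetical result lists from two engines for the same query.
engine_a = ["http://a.example/", "http://b.example/", "http://c.example/"]
engine_b = ["http://b.example/", "http://d.example/", "http://a.example/"]

merged = merge_results([engine_a, engine_b])
```

Here http://b.example/ ends up first because both engines rank it near the top, while pages returned by only one engine fall to the end.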
Searching – Where to search?
Specialized (Vertical) search engines
Koders.com
Wink.com
Social Bookmarks (Tagging)
Delicious.com
Digg.com
Visual search
SearchMe.com
Kartoo.com
Your Own search engine
Yahoo! Search BOSS
http://developer.yahoo.com/search/boss/
Google Custom Search Engine
http://www.google.com/coop/cse/
Others
Clusty.com
…
Searching – Where to search?
You Decide …
How to search - Basics
List your search targets
Search engine
Specialized website
…
List your search terms
Generate search strings from search terms
Search
Fine-tune your search strings
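The "generate search strings from search terms" step can be sketched as follows. The function name and the sample terms are made up; the idea is to start with the most specific string (all terms) and fall back to smaller combinations when fine-tuning.

```python
from itertools import combinations


def generate_search_strings(terms, min_terms=2):
    """Return every combination of at least min_terms terms, longest first."""
    strings = []
    for size in range(len(terms), min_terms - 1, -1):
        for combo in combinations(terms, size):
            # Quote multi-word terms so engines treat them as exact phrases.
            parts = [f'"{t}"' if " " in t else t for t in combo]
            strings.append(" ".join(parts))
    return strings


# Hypothetical search terms.
search_strings = generate_search_strings(["semantic web", "RDF", "tutorial"])
```

Trying the strings in order gives a simple fine-tuning loop: if the first, narrowest string returns too few results, move on to the next, broader one.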
How to search - Advanced
Google - Specialties
Search Tips/Tricks
Change language when appropriate
Always use advanced operators
Think about the best format for your results (e.g. filetype:ppt)
Write your search term as a question when appropriate
Don’t rush to search engine forms; plan first
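As a small example of planning a query with advanced operators before typing it into a form, the sketch below assembles a query string from terms plus the filetype:, site:, and minus (exclude) operators, then encodes it into a Google search URL. The helper names and the sample values are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_query(terms, filetype=None, site=None, exclude=()):
    """Combine plain terms with common advanced operators."""
    parts = list(terms)
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict to one file format
    if site:
        parts.append(f"site:{site}")          # restrict to one website
    parts.extend(f"-{word}" for word in exclude)  # drop results with these words
    return " ".join(parts)


def google_url(query):
    """Return a Google search URL for the query, with the query URL-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Hypothetical search: slide decks about web searching on one site.
query = build_query(['"web searching"'], filetype="ppt", site="slideshare.net")
url = google_url(query)
```

Keeping the query construction separate from the URL makes it easy to reuse the same string with another engine that supports the same operators.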
Web Searching 20XX
Web 3.0
Nova Spivack defines Web 3.0 as the third decade of the Web (2010–2020) during which he suggests several major complementary technology trends will reach new levels of maturity simultaneously
Web 3.0 - Semantic Web - Concepts
Semantic Web - Technical
Semantic Web
Standards
RDF – Store data as “triples”
OWL – Define systems of concepts called “ontologies”
SPARQL – Query data in RDF
SWRL – Define rules
GRDDL – Transform data to RDF
Semantic Web - Technical
RDF Triples
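To show what "storing data as triples" and querying it looks like, here is a toy sketch in plain Python. It is not an RDF library and does not parse SPARQL; the ex: identifiers are made up, and None plays the role that a variable plays in a SPARQL pattern.

```python
# Each triple is (subject, predicate, object). All identifiers are hypothetical.
TRIPLES = [
    ("ex:OWL", "ex:defines", "ex:Ontology"),
    ("ex:SWRL", "ex:defines", "ex:Rule"),
    ("ex:SPARQL", "ex:queries", "ex:RDF"),
    ("ex:GRDDL", "ex:transformsTo", "ex:RDF"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None is a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# Roughly comparable to the SPARQL pattern: SELECT ?s WHERE { ?s ex:defines ?o }
definers = [s for (s, _, _) in match(TRIPLES, predicate="ex:defines")]

# Everything that points at ex:RDF, whatever the predicate.
about_rdf = match(TRIPLES, obj="ex:RDF")
```

Because every fact has the same three-part shape, the same match function answers questions the data was never specifically designed for.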
Semantic Web - Technical
Semantic Web
The Web is the DB
Implementations
Hakia.com
cluuz.com
Twine.com
PowerSet.com
IBM & Web Searching
IBM OmniFind Personal E-mail Search
http://www.alphaworks.ibm.com/tech/emailsearch
A powerful semantic search engine that enables you to search your e-mail easily and effectively; plug-ins are available for Microsoft Outlook and Lotus Notes mail systems.
OmniFind Enterprise Edition
http://www-01.ibm.com/software/data/enterprise-search/omnifind-enterprise/features.html?S_CMP=wspace
Pre-built integrations to more than 25 enterprise sources, including Lotus® Domino®, Windows SharePoint Services, FileNet repositories, Documentum repositories, shared file systems, relational databases, Microsoft® Exchange, and many others.
Open platform for processing unstructured information to enable semantic queries, navigation of business intelligence results, and custom analytics applications.
IBM & Web Searching
Unstructured Information Management Architecture
http://www.alphaworks.ibm.com/tech/uima
Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. UIMA is a framework and SDK for developing such applications. An example UIM application might ingest plain text and identify entities, such as persons, places, organizations; or relations, such as works-for or located-at.
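The entity-identification example can be illustrated with a toy annotator. This is plain Python with a dictionary lookup, not the UIMA SDK or its API; the gazetteer entries and the sample sentence are invented for the example.

```python
import re

# Hypothetical gazetteer: surface form -> entity type.
GAZETTEER = {
    "Alice": "Person",
    "Acme": "Organization",
    "Paris": "Place",
}


def annotate(text):
    """Return (start, end, surface, type) for every gazetteer hit, in text order."""
    annotations = []
    for name, entity_type in GAZETTEER.items():
        # \b keeps "Paris" from matching inside a longer word.
        for m in re.finditer(r"\b" + re.escape(name) + r"\b", text):
            annotations.append((m.start(), m.end(), m.group(), entity_type))
    return sorted(annotations)


found = annotate("Alice works for Acme in Paris.")
entities = [(surface, entity_type) for (_, _, surface, entity_type) in found]
```

Each annotation records where in the text the entity was found, which is what lets later analysis steps (for example, relation detection such as works-for) build on earlier ones.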
IBM Web Ontology Manager
http://www.alphaworks.ibm.com/tech/wom
IBM Web Ontology Manager is a lightweight, Web-based tool for managing ontologies expressed in Web Ontology Language (OWL). With this technology, users can browse, search, and submit ontologies to an ontology repository. Developers can discover new ontologies without having to develop the ontology themselves; reusability is thereby promoted and development time and effort are reduced. This technology includes a Web interface for easy uploading of ontologies in an .owl format by any user of the system. It also includes an interface for generating (using Jastor) Java™ APIs from uploaded ontology files.
IBM & Web Searching
Semantic Layer Research Platform
https://w3.opensource.ibm.com/projects/slrp/
http://ibm-slrp.sourceforge.net/2006/12/01/what-is-the-ibm-semantic-layered-research-platform-contd/
Boca.
Boca is the foundation of many of our components. It is an enterprise-featured RDF store that provides support for multiple users, distributed clients, offline work, real-time notification, named-graph modularization, versioning, access controls, and transactions with preconditions. Matt’s written more about Boca here. Along with Boca are included two subsystems which may also be interesting on their own:
Glitter.
Glitter is a SPARQL engine independent of any particular backend. It allows interfaces to backend data sources to plug in to the core engine and generate solutions for portions of SPARQL queries with varying granularity. The core engine orchestrates query rewriting, optimization, and execution, and composes solutions generated by the backend. A Boca-specific backend allows SPARQL queries to be compiled to Boca’s temporal database schema.
Sleuth.
Sleuth provides full-text search capabilities for text literals within Boca. Text literals are indexed with Apache Lucene, and the index also stores information about the named graph, subject, and predicate to which the literal is attached.
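The idea of indexing text literals together with their named graph, subject, and predicate can be sketched with a small inverted index. This is plain Python, not Sleuth or Apache Lucene, and the graph names, subjects, and literals below are made up.

```python
from collections import defaultdict


def build_index(quads):
    """Map each lowercase token to the (graph, subject, predicate) it came from."""
    index = defaultdict(set)
    for graph, subject, predicate, literal in quads:
        for token in literal.lower().split():
            index[token].add((graph, subject, predicate))
    return index


def search(index, word):
    """Return the sorted (graph, subject, predicate) keys whose literal has the word."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical data: (named graph, subject, predicate, text literal).
QUADS = [
    ("g:docs", "ex:boca", "rdfs:comment", "Enterprise RDF store"),
    ("g:docs", "ex:glitter", "rdfs:comment", "SPARQL engine"),
    ("g:notes", "ex:sleuth", "rdfs:comment", "Full-text search engine"),
]

index = build_index(QUADS)
hits = search(index, "engine")
```

Storing the graph, subject, and predicate next to each token means a text hit can be traced straight back to the statement that contains it.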
A platform for building Semantic applications that use RDF, LSID and other Semantic Web technologies. The platform includes several components, such as an RDF server with collections, ACLs, replication and transactions, and client and web development kits, including an Eclipse suite of plugins for RDF consumption.
IBM & Web Searching
IBM Integrated Ontology Development Toolkit
http://www.alphaworks.ibm.com/tech/semanticstk
Ontology: an ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts.
IODT is a toolkit for ontology-driven development. This toolkit includes EMF Ontology Definition Metamodel (EODM) and an OWL ontology repository named Scalable Ontology Repository (SOR).
Others
IBM Multimedia Analysis and Retrieval System
IBM LanguageWare Miner for Multidimensional Socio-Semantic Networks
Anatomy Lens
Agent Building and Learning Environment
System Text for Information Extraction
Scalable Highly Expressive Reasoner
MUFIN
http://mufin.fi.muni.cz/imgsearch/
Projects
http://www.guardian.co.uk/media/2007/mar/05/bbc.newmedia
Summary
- WS 2.0 is about searching Web 2.0 effectively
- Most of the web (the invisible web) does not appear in search engine results
- Web searching is not limited to search engines
- Web 3.0 = Web 2.0 + Semantic Web
- IBM and WS
Thank You