Web Semantic
"Web Semantic" bundle created by rup3rt
- Nodalities
- ZDNet Semantic Web
- Tim Berners Lee
- Semantic on Digg
- SemWeb bookmarks
- Online Ontology Visualisation
- Ivan Herman - Semantic Web
- Planet RDF
- The Semantic Web Gang
- Web of Data
- Stefano's Linotype
- AI3 ::: Adaptive Information
- Kingsley Idehen
- Planet Semantico
- Semantic Overflow
In late 2011, Mark Nottingham, whom I very much admire on a personal and professional level, posted ‘Linking in JSON’, which triggered quite some discussion (see the comments there).
Back then already I sensed that the community at large is ready for the next aspect of the Web. A scalable, machine-targeted way to realise a global dataspace. And it’s happening as we speak.
Take JSON and HTTP (some use REST for marketing purposes) and add the capability of following (typed) links that lead you to more data (context, definitions, related stuff, whatever).
And here are the three current contenders in this space (in the order of stage appearance) – Microsoft’s OData JSON Format, The Object Network: Linking up our APIs, and – as I learned from Charl van Niekerk on #whatwg IRC channel tonite – A Convention for HTTP Access to JSON Resources.
What they all have in common is that they define ways to read, create, update and delete data objects, in the Web, based on JSON, using HTTP.
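To make the "follow typed links" idea concrete, here is a minimal sketch. It does not use the format of any of the three proposals: the `links`/`rel`/`href` layout, the example.org URLs and the in-memory `FAKE_WEB` stand-in for HTTP are all assumptions made for illustration.

```python
import json

# Hypothetical in-memory "Web": URL -> JSON body.
# A real client would issue HTTP GET requests instead.
FAKE_WEB = {
    "http://example.org/people/1": json.dumps({
        "name": "Alice",
        "links": [{"rel": "employer", "href": "http://example.org/orgs/7"}],
    }),
    "http://example.org/orgs/7": json.dumps({
        "name": "Example Corp",
        "links": [],
    }),
}


def fetch(url):
    """Stand-in for an HTTP GET that returns a parsed JSON object."""
    return json.loads(FAKE_WEB[url])


def follow(resource, rel):
    """Return the resources reachable from `resource` via links typed `rel`."""
    return [
        fetch(link["href"])
        for link in resource.get("links", [])
        if link["rel"] == rel
    ]


person = fetch("http://example.org/people/1")
employers = follow(person, "employer")
print(employers[0]["name"])
```

Create, update and delete would map onto POST, PUT and DELETE against the same URLs; the sketch only covers the read-and-follow part.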
OData/JSON
Totally-objective-and-unbiased-verdict: around for some years now, great community, backed by big bucks, heavy-weight (they squeezed friggin APP into it), rather RESTful and becoming more and more a shadow Semantic Web.
The Object Network
Totally-objective-and-unbiased-verdict: too early to tell, really. Seems like a one-man-show, nice idea in theory, time will tell uptake. Many things, incl. link semantics, seem half-baked and unclear. Good motivation and marketing but too few ‘apps’ or demos to be of any interest.
A Convention for HTTP Access to JSON Resources
Totally-objective-and-unbiased-verdict: just learned about it, but seems to be influenced by CouchDB developments and experiences which means it can’t be that bad, can it?
Yeah, I guess I’ll have a closer look at this one.
Now, which one is your favorite? Did I forget any? Before you shout out JSON-LD or the likes now … hold your breath – my #1 requirement is that it does the Full Monty: I want to be able to CRUD, to follow my nose through the data and all this over HTTP. Anyone?
Filed under: FYI, IETF, Linked Data, NoSQL
The Provenance Working Group just had its second F2F meeting, where we made substantial progress on a number of issues in creating a way to interchange provenance on the Web. We wanted to let the community know where we’re at and where we are going.
Overall, we have a good set of first drafts of the PROV family of documents but there’s still a ways to go in getting them all in-line with each other and well presented such that they are useful to the developer and user communities. This meeting focused on the issues of how we can make rapid progress while achieving that goal.
Simplify, Simplify, Simplify
We heard the response to our first working drafts and have been
simplifying the PROV Data Model. Our mantra has been simplify,
simplify, simplify. You’ll see some of it in our recent 3rd Working Draft
of PROV-DM. But there’s still a ways to go to get the document
where we want it to be, in particular, in terms of how constructs
and concepts are explained.
While most of the constructs will remain the same, at the F2F meeting, we agreed to simplify the notion of account (a container for provenance) to focus on the use case of provenance of provenance.
Two Broad Use Cases
At the meeting, it became clear that one of the hard parts of
devising a common interchange format was being able to support the
group’s two broad use cases:
- The ability to use the PROV vocabulary to make provenance statements about existing things on the Web. Think, for example, of adding simple provenance metadata (e.g. authorship) to a web page.
- The ability to exchange PROV information between provenance systems where a static or fixed view of data is key. This is common in current provenance tracking systems. Think exchanging information between version control systems or two scientific workflow systems.
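As a rough illustration of the first use case, an authorship statement about a web page can be written as a single triple. This sketch assumes the `prov:wasAttributedTo` term and the `http://www.w3.org/ns/prov#` namespace from the later PROV-O vocabulary; the working drafts discussed here may name things differently, and the example.org URLs are placeholders.

```python
# Namespace of the PROV-O vocabulary (an assumption relative to the drafts above).
PROV = "http://www.w3.org/ns/prov#"


def attribution(page_url, author_url):
    """Return one N-Triples line stating that `page_url` is attributed to `author_url`."""
    return f"<{page_url}> <{PROV}wasAttributedTo> <{author_url}> ."


line = attribution("http://example.org/post", "http://example.org/people/alice")
print(line)
```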
This realization helped the group in thinking about how best to explain PROV. Since PROV supports both use cases, we will aim to first explain how to use it in the broad case and then describe how one can use it in use cases that require a more exacting view.
Working with Dublin Core
At the working group we are always aware of existing provenance
vocabularies on the Web. In particular, we are excited that Kai
Eckert will be leading a best practice document on how PROV works
with Dublin Core, one of the
most widely used provenance vocabularies on the Web.
The Community
If you’re interested in PROV, we encourage you to first begin with
the PROV-Primer.
This is the best place to get an understanding of PROV. In the next
two months, we’ll be producing updated working drafts. We hope to
have a complete set for the community to review. In the meantime,
we are always interested in your input. If you’re using PROV now,
please let us know.
Every once in a while I add another piece of query optimization code to the
The RDF Web Applications Working Group has published three Last
Call Working Drafts:
* RDFa Core 1.1,
* RDFa Lite 1.1 and
* XHTML+RDFa 1.1.
Together, these documents outline the vision for RDFa in a variety of XML and HTML-based Web markup languages. RDFa Core 1.1 specifies the core syntax and processing rules for RDFa 1.1 and how the language is intended to be used in XML documents. RDFa Lite 1.1 provides a simple subset of RDFa for novice Web authors. XHTML+RDFa 1.1 specifies the usage of RDFa in the XHTML markup language.
Public reviews due by 21 February.
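To give a feel for what RDFa Lite markup looks like to a consumer, here is a minimal sketch that pulls `property` values out of a small HTML snippet. It is not a conforming RDFa processor: it ignores `vocab`, `typeof`, `resource` and `prefix` handling, and the sample markup and the `PropertyCollector` class are made up for illustration.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (property, text) pairs from elements carrying a `property` attribute."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._current = None  # property name of the element being read, if any

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if prop:
            self._current = prop

    def handle_data(self, data):
        # Take the first non-empty text node after a `property` start tag.
        if self._current and data.strip():
            self.pairs.append((self._current, data.strip()))
            self._current = None


SAMPLE = (
    '<p vocab="http://schema.org/" typeof="Person">'
    '<span property="name">Alice</span> works as a '
    '<span property="jobTitle">editor</span>.</p>'
)

collector = PropertyCollector()
collector.feed(SAMPLE)
print(collector.pairs)
```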
For collaborative international research projects in the area of intelligent information management, the Business School of Bern University of Applied Sciences (BUAS) in cooperation with research group Agile Knowledge Engineering and Semantic Web (AKSW) at Universität Leipzig opens positions for:
2 PhD and 1 PostDoc Position in Knowledge Engineering / Semantic Web
The positions are primarily based at BUAS (Switzerland) and funded by European FP7 projects and possibly Swiss national research grants. A close collaboration and ca. 4 research visits per year at AKSW research group at Universität Leipzig (Germany) are envisaged for the PhD students to complete their PhD program.
We offer
- The stimulating environment of two research institutes in the fields of Business Informatics, Semantic Web, Ontology Engineering, Linked Data Web, Knowledge Management, Data integration and Service-Oriented Architectures;
- Long-term collaboration with well-known academic institutions and major companies around the world;
- A multicultural working place with state-of-the-art infrastructure, a competitive salary and resources including funding for attending international conferences, PhD symposia, summer schools, etc.;
- competitively funded PhD positions close to the rate of the Swiss National Science Foundation (currently ca. CHF 41’000);
- competitively paid Postdoc positions commensurate with the pay scale of BUAS (starting at CHF 80’000 depending on experience).
We expect
- A strong background in Computer Science or related disciplines;
- Excellent software engineering skills with demonstrated proficiency in modern software development;
- The willingness to work in an international environment and combine formal scientific work with application-oriented research in order to solve real-world problems;
- Research interest and expertise in at least one of the following: knowledge representation and ontology languages, natural language processing, data management and integration, Semantic Web standards, business aspects of semantic systems;
- Prospective PhD students should fulfill the doctorate entrance requirements of Universität Leipzig (i.e. master’s degree or equivalent);
- Proficiency in English and the willingness to learn one of the official Swiss languages (e.g. German, French, Italian).
To apply
Applicants should include a cover letter, curriculum vitae incl.
list of publications, a research statement and the names and
addresses of two referees, via email (PDF only) to ksm1@bfh.ch (Dr.
Michael Kaschewsky, Head of Research Group, Bern University of
Applied Sciences, Business School). Positions are open until
filled, but candidates are advised to apply by 1 March 2012. In
addition, qualified Postdoctoral researchers have the opportunity
to get funding for their position and additionally for a doctoral
position that they supervise independently but must apply by 15
February 2012 – if you are interested please contact us asap.
About us
Bern University of Applied Sciences (BUAS) is the regional leader
in applied science and research with seven departments across three
cities. Research at the Business School in Bern is nationally
leading and internationally renowned in the field of e-government
and applied informatics in the public sector.
AKSW research group at the Universität Leipzig is establishing
theoretical results and scalable implementations for the Semantic
Data Web (e.g. DBpedia, OntoWiki, DL-Learner). Particular emphasis
is given to areas such as ontology creation and manipulation,
knowledge extraction, ontology learning and information & data
integration on the Semantic Data Web.
Additional information regarding our research and projects as well
as further information concerning these positions is available at
http://bfh.ch and http://aksw.org.
Let me show you which steps have to be taken to generate a high-quality text mining application, ready to be used to annotate and to categorize any kind of text or documents covering nearly any domain. With our approach of thesaurus based text mining your documents can also be linked to the world of linked (open) data; enrich your documents with data from the LOD cloud!
Step 1. Generate a thesaurus by using a linked data source like DBpedia
As recently reported, SWC has developed a tool called SKOSsy, which can be used to extract seed thesauri from DBpedia. In our example I will generate a knowledge model describing the domain of “digital photography”. This step took around 15 minutes.
Step 2. Load the thesaurus into PoolParty and improve it to your needs
After the seed thesaurus has been loaded into PoolParty Thesaurus Manager you have many possibilities to enhance the knowledge model further: Add more categories, synonyms, relations etc. In this example I use the seed-thesaurus without any further improvements. This step took approximately 2 minutes.
Step 3. Generate an automatic text extractor on top of your thesaurus
This step took a couple of seconds and ended up in having generated a fast and reliable text mining application on top of PoolParty Extractor, ready to be used to enrich your documents with data from the LOD cloud.
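The principle behind thesaurus-based extraction can be sketched in a few lines. This is not how PoolParty Extractor is implemented; it is a naive label matcher, and the concept URIs, labels and the `annotate` function are hypothetical.

```python
import re

# Hypothetical seed thesaurus: concept URI -> preferred label plus synonyms.
THESAURUS = {
    "http://example.org/concept/shutter-speed": ["shutter speed", "exposure time"],
    "http://example.org/concept/image-noise": ["image noise", "noise"],
}


def annotate(text):
    """Return {concept URI: match count} for every thesaurus label found in `text`."""
    lowered = text.lower()
    hits = {}
    for uri, labels in THESAURUS.items():
        count = 0
        for label in labels:
            # Whole-word match so that "noise" does not fire inside "noiseless".
            count += len(re.findall(r"\b" + re.escape(label) + r"\b", lowered))
        if count:
            hits[uri] = count
    return hits


sample = "A slow shutter speed lets in more light, but a long exposure time can add noise."
annotations = annotate(sample)
print(annotations)
```

Because every match resolves to a concept URI rather than a bare string, the annotations can be used directly as links into a linked data source.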
You can try it out here: PPX Live-Demo
To try the extractor on your own, please take a look at the image above, which shows a proper configuration. You have to insert the following UUID in the form: d35d4ddb-adc3-4ea5-b027-deacac03e391
Since our example is all about ‘digital photography’, we recommend using text samples (or fragments of them) like these to test the quality of PPX-based text analytics:
- Digital Camera Image Noise (Results as HTML, RDF/XML)
- Nikon D3S In-depth Review (Results as HTML, RDF/XML)
- Introduction to Shutter Speed in Digital Photography (Results as HTML, RDF/XML)
- Digital Camera Sensors (Results as HTML, RDF/XML)
Let us know what you think about this straightforward approach and your opinion about the quality of the results. We believe that thesaurus-based text mining is in many cases an alternative to some other approaches, especially if you want to enrich your content with information from the upcoming web of data.
Of course we would be happy to generate other demos in the areas of your interest! Just get in contact with us by using our contact form.
Abstract
This article is a short summary of my experiences with leadership. Although these points are very simple to write down, every new leader should learn them one by one to gain a deep understanding.
Nature of a good leader
- Always be prepared
If you have a meeting with your team, always spend time preparing for it.
- You are the hero of your team
Whether you want it or not, you are the one who will be followed, so always show the best of yourself (if you are unstable, your team will be unstable; if you are focused, your team will be focused).
If you do an extraordinary job, arrive on time at meetings, work late nights and do your best, your team will follow you.
- Give strict but honest evaluations
Telling the hard truth is not always pleasant, but it is a must.
Always focus on improvement.
- A "Thank you" is more important than money
- Listen to your team
Most of the time, the team can solve every problem together with you, because your problem is your team's problem, and your team's problem is yours. Listening carefully will always help you understand their inner thoughts, and yes, it can mean more work for the leader, because a good leader cares.
- Give vision
Telling your team members day by day what to do will narrow their focus, and some time later they will just "follow the commands".
You should give them a broader vision, a goal that consists not only of numbers but of values. Vision, mission and value-creating meetings can help to develop these goals. Be as honest as you can in these meetings, sharing your hopes and what you want to become reality.
Delegating tasks consists not only of stating clearly what to do and what the expected outcome is, but also of giving the "Why".
- Sit next to your team members
Pair working will help to improve the efficiency of your job, especially when you have a complex job, and will also help to
- Be calm
Never lose your head, or if you do, do it with a purpose.
- Be determined
Have you ever felt insecure and unable to find the right way? Just go ahead!
- You should choose between "A" or "B"
At first glance it may seem fun to always have the final word, but after some time it is not so simple. You should learn to make a decision regardless, because a bad decision is always better than no decision. Taking risks will make you a great leader.
- Don't be a sheep
Not everything others say is good, even if the majority swears on their lives, and even if the analysis says so.
...and why does this really work? Because we should solve the candle problem
Communication
- Keep eye contact during conversation
You should always look into your colleague's eyes.
- Listen carefully
A good leader is a good listener.
- Use words like
  - "We" instead of "I" or "You"
  - "I trust in your job"
- Your body tells the rest of the story
Your gestures and body language are as important as what you say. Learn how to control them.
- Smile
A smile shows that you like what you are doing.
AGU 2011 was the first conference I attended after I came to RPI. There were many interesting activities at the conference and it was a very rewarding experience. I presented the poster for the semantic water quality portal and also helped a little bit with the RPI table in the academic section (by just being there). I went to several talks and visited some exhibit booths. We had a group lunch at Chevy’s, which was very fun!
Preparing the poster helped me to rethink the water quality
portal. Thanks to Evan for his ISWC 2011 poster, which was a very
good starting point for my AGU poster! Having the poster session
was even more fun and rewarding! I presented the portal to
researchers from various fields and countries. Most of the people I
talked to said that the portal is a nice and interesting project.
Some researchers gave me very helpful comments like:
1. bring in crowdsourcing, e.g. let users report problems
2. help farmers to identify polluted wells
3. we should have an approach for pulling new data from USGS and EPA, e.g. some subscription
4. regulation management for users (insert/upload/delete)
5. consider allergies as a use case; possible conditions for an allergy alert: wind + time, a combination of pollutants
I went to several talks during AGU and got to know the cool projects that researchers from different organizations (EPA, Stanford, UMD, Google, NASA) have been doing. It was impressive to see how widely and deeply computer science has been used in geophysical research. And I felt that scientists from geophysical fields expect more cooperation with people from computer science.
I went to the exhibits twice and spent quite some time there. I used the wired network provided by Google to do my assignments for the AI course. I also listened to the talk about Google Earth Engine, a very cool platform for geophysical scientists!
Attending a conference as huge as AGU indeed requires some energy, but in the end it is worthwhile.
Tips about travel reimbursement that Carol gave me today:
1. keep your boarding pass to show that you sat in economy class
2. check out at the hotel and get the folio to show how many nights you actually stayed at the hotel
3. get itemized receipts at restaurants
Thanks, Carol!
The W3C RDF Web Applications Working Group has published three Last Call Working Drafts today:
- RDFa Core 1.1
- RDFa Lite 1.1
- XHTML+RDFa 1.1
Together, these documents outline the vision for RDFa in a variety of XML and HTML-based Web markup languages. RDFa Core 1.1 specifies the core syntax and processing rules for RDFa 1.1 and how the language is intended to be used in XML documents or in HTML. RDFa Lite 1.1 provides a simple subset of RDFa for novice Web authors. XHTML+RDFa 1.1 specifies the usage of RDFa in the XHTML markup language.
A number of improvements have been made to RDFa 1.1 over the past year by working closely with Google, Microsoft, Yahoo! and the other search engine developers. Public review and comments have resulted in a number of further refinements to the language that ease the learning curve for beginner Web authors.
The release of these documents as Last Call Working Drafts is a signal to the public that the Working Group believes that all of the technical requirements, public comments and reported issues have been addressed. It is also an open invitation to the general public to review and provide feedback on the finalization of this technology via the RDF Web Applications Working Group mailing list, by 21 February.

Tomás Saorín & Juan Antonio Pastor Sánchez have published a Spanish translation of the W3C Linked Library Data Incubator Group’s report “Datasets, Value Vocabularies, and Metadata Element Sets”, under the title “Conjuntos de Datos, Vocabularios controlados y Conjuntos de Elementos de Metadatos”.
(This post was supposed to be posted a month ago. But I had some trouble accessing the TW weblog website when I was in China, so I have to post it now after I came back to Troy.)
The AGU 2011 Fall Meeting was the first time I went to an academic conference. I was very excited when I learned I had such an opportunity. My goals were to present our poster, to see what such a conference is like, and to get an idea of what other people are doing in the Informatics area.
My poster was about the work with Eric Rozell on the temporal metadata modeling in VSTO. I presented its motivation and methodology to several people, and it certainly drew some interest. Our approach has been viewed as an effective way to deal with a large amount of data and to improve reasoning and searching capacities. It was suggested that a similar technique (in the sense of including the temporal range for a dataset to a granularity of days using time:DateTimeInterval) has been used for data indexing in relational databases in NASA. In terms of the presentation, I think putting our posters, publications, and demos into flash drives and distributing them to people was a very good idea. It greatly helped the interested audiences to understand our work more afterwards.
There was much other interesting work across a couple of sessions. For example, Nicholas Del Rios et al. from the University of Texas at El Paso presented a semantic and provenance-aware visualization framework (VisKo) that links data with visualization processes. It has been used to visualize data on behalf of Giovanni. It is able to capture data processing provenance and visualization provenance in PML. Besides posters, I also went to several talks from different sessions. Though I failed to connect most of them to my research work, I thought it was nice to hear about what other people have been working on.
Another outcome for me was meeting people in the Earth Science and Informatics areas. Although the names I could remember were limited, what I saw was a group of people who show enthusiasm about their work. They believe in what they are doing and have confidence in what their work will accomplish. I really look forward to working with many of them.
To sum up, this was a great experience for me in the beginning stage of my Ph.D. career. Next time I will try to meet and talk to more people, and get more feedback about my own work.
A virtual natural resource? Doesn’t make sense, does it?
Let me explain.
Natural resources are derived from the environment. Many of them are essential for our survival while others are used for satisfying our wants.
… is with Wikipedia
I'd just like to plant a little flag in the sand. Big Data seems to be the flavour of the month (and is undeniably extremely useful and interesting), but I've a gut feeling that might be symptomatic of not seeing the wood for the trees (or maybe vice versa).
I've not thought this through much, but surely any trends/correlations/relationships that are important enough to be of interest should be detectable without having to build a terabyte+ store? Rather than trying to capture as much raw data as possible up front, I suspect a more productive approach long-term will be to work with (maybe federated) crawler farms, with lots and lots of algorithms running in parallel over what they see. If there are appropriate training feedback loops in place, the shape of the algorithms themselves could be treated as the results of the analysis.
It could be argued that once you have accumulated a corpus of raw data you can subsequently throw whatever you like at it without having to get the raw data again. But that corpus will never be complete or truly fresh - as new data appears on the Web all the time. More critically, under normal circumstances you can never be sure you've got a dataset that contains a good sample representation covering whatever unknowns you're exploring. But crawlers can be directed to favour slices of the Web that contain information relevant to your hypotheses.
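As one illustration of an algorithm that runs over what a crawler sees without keeping the raw data, here is a sketch of a streaming Pearson correlation that uses constant memory (a Welford-style update). The choice of correlation as the statistic and the `RunningCorrelation` class are assumptions made for the example, not something this post prescribes.

```python
import math


class RunningCorrelation:
    """Pearson correlation over a stream of (x, y) pairs, using constant memory."""

    def __init__(self):
        self.n = 0
        self.mean_x = 0.0
        self.mean_y = 0.0
        self.m2_x = 0.0  # running sum of squared deviations of x
        self.m2_y = 0.0  # running sum of squared deviations of y
        self.c_xy = 0.0  # running sum of co-deviations of x and y

    def update(self, x, y):
        """Fold one observation into the running statistics, then forget it."""
        self.n += 1
        dx = x - self.mean_x  # deviation from the old mean of x
        dy = y - self.mean_y  # deviation from the old mean of y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        self.m2_x += dx * (x - self.mean_x)
        self.m2_y += dy * (y - self.mean_y)
        self.c_xy += dx * (y - self.mean_y)

    def value(self):
        """Return the correlation so far, or NaN if either variable has no variance."""
        if self.m2_x == 0 or self.m2_y == 0:
            return float("nan")
        return self.c_xy / math.sqrt(self.m2_x * self.m2_y)


rc = RunningCorrelation()
for x, y in [(1, 2), (2, 4), (3, 6)]:
    rc.update(x, y)
print(rc.value())
```

Each crawler could hold many such accumulators, one per hypothesis, and only the accumulators would need to be shared or stored.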
So, in the context of the Web, the Web itself should be the only big data needed. Which gives a neat parallel in the other sciences: reality itself is the only database you'll ever need :)
Ok, in the same way that Big Sites (like Wikipedia/dbPedia) add big value to the Web alongside lots of small pieces, loosely joined, the same no doubt goes for Big Data. But let's not forget the vice versa, a complementary Small Data approach.
Somewhat orthogonal to this, one way in which the Web is a game changer for data is that here the relationship between pieces of data (/documents) is at least as significant as those pieces of data stacked on top of each other. Link Rank is a special case, an aggregated, flattened view of link value. If topics and entities (i.e. things in general: people, places, concepts etc.) and their interrelationships are inferred and/or explicitly named, it should expose some interesting facets of how human knowledge works.






