Website mashup with Apache Camel

Mashup?

While browsing a website, you come across some interesting information that you would like to poll and use in your own system.

Unfortunately, you don’t know the website provider, and you don’t know whether a “plug” is provided, for instance a web service.

So you have to find another way to get the information.

Using Camel

You can create a Camel route like this:

  <route>
    <from uri="timer:fire?period=2000"/>
    <!-- Camel 3+ attribute; Camel 2.x uses headerName="CamelHttpMethod" -->
    <setHeader name="CamelHttpMethod">
        <constant>POST</constant>
    </setHeader>
    <to uri="http:blog.nanthrax.net?param=value"/>
    <unmarshal>
        <tidyMarkup/>
    </unmarshal>
    <setBody><xpath>//span[@class='date']</xpath></setBody>
    <to uri="log:blog"/>
  </route>

Here, every 2 seconds, the route requests blog.nanthrax.net to get the HTML source. We can optionally pass some parameters (using the POST method here).
On this HTML, we use the tidyMarkup data format (from the camel-tagsoup component) to clean up the HTML and turn it into well-formed XML.
After that, we extract from the source only the content of the span elements whose class is date (the XPath expression //span[@class='date']).
Finally, we send the filtered content to the log.
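
To make the extraction step more concrete, here is a minimal sketch of what the XPath expression selects, written with the JDK's own XPath API rather than Camel. The class name DateSpanExtractor, the method extractDates and the sample markup are assumptions made up for illustration; the sketch also assumes the markup is already well-formed XML without namespaces, which is the job of the tidyMarkup step in the route.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Camel: it only mimics the XPath step of the route.
public class DateSpanExtractor {

    // Returns the trimmed text of every <span class="date"> found in the given XML.
    public static List<String> extractDates(String xml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                    "//span[@class='date']",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> dates = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                dates.add(nodes.item(i).getTextContent().trim());
            }
            return dates;
        } catch (XPathExpressionException e) {
            // Raised, for instance, when the input is not well-formed XML.
            throw new IllegalArgumentException("Cannot evaluate XPath on input", e);
        }
    }

    public static void main(String[] args) {
        // Invented sample markup standing in for the tidied page.
        String page = "<html><body>"
                + "<span class=\"date\">sample date</span>"
                + "<span class=\"author\">sample author</span>"
                + "</body></html>";
        System.out.println(extractDates(page)); // prints [sample date]
    }
}
```

Only the span carrying class="date" is kept; the other span is ignored, which is the filtering the route performs before logging.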
