Without the vibrant Ruby open-source ecosystem, I would not have been able to develop Promoter by myself. It uses over 45 Ruby gems today, many of them handling essential tasks. For some time I have wanted to extract some features from Promoter and make them available to other developers, but I didn’t get around to it until now.
Raev is a Ruby gem for fetching, parsing and normalizing metadata from websites. If you want to parse metadata from websites or RSS feeds, you’re faced with the challenge that practically no one in the real world uses sensible standards or microformats. Raev tries to take away some of that pain by parsing the metadata and normalizing it into something more usable. The feature set of version 0.1.10 is still somewhat limited, but it offers some things that you might find useful:
- Fetch the Twitter handle from a website
- Fetch the RSS feed from a website
- Return the base domain (without www) for a URL
- Resolve shortened or proxied URLs and remove UTM analytics parameters
- Normalize the author name of an RSS feed entry
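
To give a rough idea of what two of these normalizations involve, here is a minimal sketch using only Ruby's standard library. It is not Raev's API: the method names `base_domain` and `strip_utm` are hypothetical and made up for this illustration, and the sketch only covers the simple cases (dropping a leading `www.` and removing `utm_*` query parameters, with no redirect resolving).

```ruby
require "uri"

# Hypothetical helpers for illustration only; these are NOT Raev's methods.

# Returns the host of a URL, lowercased and without a leading "www.".
def base_domain(url)
  host = URI.parse(url).host
  host && host.downcase.sub(/\Awww\./, "")
end

# Removes utm_* query parameters and keeps all other parameters in order.
def strip_utm(url)
  uri = URI.parse(url)
  return url if uri.query.nil?

  kept = URI.decode_www_form(uri.query).reject { |key, _| key.start_with?("utm_") }
  # Drop the "?" entirely when no parameters are left.
  uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
  uri.to_s
end

puts base_domain("http://www.example.com/blog/post")
# prints "example.com"
puts strip_utm("http://example.com/post?id=7&utm_source=twitter&utm_medium=social")
# prints "http://example.com/post?id=7"
```

Resolving shortened or proxied URLs would additionally require following HTTP redirects, which this sketch leaves out.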
I’m planning to add more features to Raev that deal with scraping metadata from article pages, such as headline, publication date, and author name.