
I use GitHub Pages for my personal website. They're upgrading from Jekyll 2 to Jekyll 3 and sending deprecation warnings. I complied with the warnings and switched from redcarpet to kramdown and from pygments to rouge. When I build locally (with `bundle exec jekyll serve`) everything works. But when I push the changes, the syntax highlighting gets mangled wherever I use `linenos` in my code blocks.

This is the code block:

{% highlight python linenos %}
'''
scrape lyrics from vagalume.com.br
(author: thiagomarzagao.com)
'''

import json
import time
import pickle
import requests
from bs4 import BeautifulSoup

# get each genre's URL
basepath = 'http://www.vagalume.com.br'
r = requests.get(basepath + '/browse/style/')
soup = BeautifulSoup(r.text)
genres = [u'Rock',
          u'Ax\u00E9',
          u'Forr\u00F3',
          u'Pagode',
          u'Samba',
          u'Sertanejo',
          u'MPB',
          u'Rap']
genre_urls = {}
for genre in genres:
    genre_urls[genre] = soup.find('a', class_ = 'eA', text = genre).get('href')

# get each artist's URL, per genre
artist_urls = {e: [] for e in genres}
for genre in genres:
    r = requests.get(basepath + genre_urls[genre])
    soup = BeautifulSoup(r.text)
    counter = 0
    for artist in soup.find_all('a', class_ = 'top'):
        counter += 1
        print 'artist {} \r'.format(counter)
        artist_urls[genre].append(basepath + artist.get('href'))
    time.sleep(2) # don't reduce the 2-second wait (here or below) or you get errors

# get each lyrics, per genre
api = 'http://api.vagalume.com.br/search.php?musid='
genre_lyrics = {e: {} for e in genres}
for genre in artist_urls:
    print len(artist_urls[genre])
    counter = 0
    artist1 = None
    for url in artist_urls[genre]:
        success = False
        while not success: # retry loop in case your connection flickers
            try:
                r = requests.get(url)
                success = True
            except:
                time.sleep(2)
        soup = BeautifulSoup(r.text)
        hrefs = soup.find_all('a')
        for href in hrefs:
            if href.has_attr('data-song'):
                song_id = href['data-song']
                print song_id
                time.sleep(2)
                success = False
                while not success:
                    try:
                        song_metadata = requests.get(api + song_id).json()
                        success = True
                    except:
                        time.sleep(2)
                if 'mus' in song_metadata:
                    if 'lang' in song_metadata['mus'][0]: # discard if no language info
                        language = song_metadata['mus'][0]['lang']
                        if language == 1: # discard if language != Portuguese
                            if 'text' in song_metadata['mus'][0]: # discard if no lyrics
                                artist2 = song_metadata['art']['name']
                                if artist2 != artist1:
                                    if counter > 0:
                                        print artist1.encode('utf-8') # change as needed
                                        genre_lyrics[genre][artist1] = artist_lyrics
                                    artist1 = artist2
                                    artist_lyrics = []
                                lyrics = song_metadata['mus'][0]['text']
                                artist_lyrics.append(lyrics)
                                counter += 1
                                print 'lyrics {} \r'.format(counter)

    # serialize
    with open(genre + '.json', mode = 'wb') as fbuffer:
        json.dump(genre_lyrics[genre], fbuffer)
{% endhighlight %}

This is what I see locally:

enter image description here

This is what I see on GitHub Pages:

enter image description here

(Without linenos the syntax highlighting works fine.)

What could be happening?

Parzival

1 Answer


I think I got it!

Your code block seems to be fine. No problem there.

Make sure you have added this into your _config.yml:

highlighter: rouge
markdown: kramdown
kramdown:
  input: GFM

Probably what you're missing is the `kramdown: input: GFM` setting, isn't it?

I tested this locally and it worked fine. After I pushed it to GitHub, it worked fine as well, so it should work for you too.
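
If you want to double-check the config before pushing, here is a rough sketch of a sanity check in Python. It is only an illustration: `read_simple_yaml` and `missing_settings` are hypothetical helper names, and the parser handles nothing beyond flat `key: value` lines with one level of nesting, so treat it as an assumption-laden shortcut rather than a real YAML reader.

```python
import re

# The three settings suggested above, written as dotted keys.
REQUIRED = {
    "highlighter": "rouge",
    "markdown": "kramdown",
    "kramdown.input": "GFM",
}


def read_simple_yaml(text):
    """Flatten a small two-level YAML mapping into dotted keys.

    Assumption: only `key: value` lines and one level of indentation.
    This is not a general YAML parser.
    """
    settings = {}
    parent = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # drop trailing comments
        if not line.strip():
            continue
        match = re.match(r"^(\s*)([\w-]+):\s*(.*)$", line)
        if not match:
            continue
        indent, key, value = match.groups()
        value = value.strip("'\"")
        if not indent:
            # A top-level key with no value opens a nested section.
            parent = key if value == "" else None
            if value != "":
                settings[key] = value
        elif parent:
            settings[parent + "." + key] = value
    return settings


def missing_settings(text):
    """Return the required keys that are absent or set to another value."""
    found = read_simple_yaml(text)
    return sorted(k for k, v in REQUIRED.items() if found.get(k) != v)
```

To use it, pass in the contents of your `_config.yml`, for example `missing_settings(open('_config.yml').read())`. An empty list means all three settings are present.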

Let me know how it goes, ok? :)


UPDATE!

Add this to your stylesheet and check how it goes:

.lineno { width: 35px; }

Looks like it's something about your CSS styles that is breaking the layout! Keep tweaking your CSS and you're gonna be fine!
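
If it helps with the tweaking, one way to see which selectors your stylesheet has to cover is to list the tag and class pairs in the generated page. Below is a minimal sketch using Python's standard `html.parser`; `ClassCollector` and `classes_in` are names I made up, and the sample markup in the usage note is invented for illustration, so the markup your build actually emits may differ.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Record a (tag, class) pair for every class on every element."""

    def __init__(self):
        super().__init__()
        self.pairs = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                for cls in value.split():
                    self.pairs.append((tag, cls))


def classes_in(html):
    """Return the (tag, class) pairs found in an HTML string, in order."""
    collector = ClassCollector()
    collector.feed(html)
    return collector.pairs
```

For example, calling `classes_in('<td class="gutter"><pre class="lineno">1</pre></td>')` on that made-up fragment gives the pairs `("td", "gutter")` and `("pre", "lineno")`, which you can then compare against the selectors in your syntax stylesheet.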

Virtua Creative
  • This is the [path](https://github.com/VirtuaCreative/test) to my "test" repo, if you want to check the source. – Virtua Creative Feb 11 '16 at 22:53
  • Thank you so much for going through all this trouble! Indeed, I was missing `kramdown: input: GFM`. Somehow adding that didn't solve the issue though. I inspected your HTML (view-source:http://virtuacreative.github.io/test/2016/welcome-to-jekyll1/) and it's practically identical to the HTML I have, except that your code block is enclosed within a ` ` tag and mine is enclosed within a ` ` tag instead. Other than that, the rest seems to be the same (I put my HTML in a gist: https://gist.github.com/thiagomarzagao/1fa0c6776ab207a7a086). – Parzival Feb 11 '16 at 23:18
  • So, somewhere in my repo I must be doing something terribly stupid that's mangling my HTML. I guess I'll keep tweaking here and there until I find out what it is. – Parzival Feb 11 '16 at 23:18
  • No trouble at all! Hmm.. interesting... perhaps it's not your html that is messed, might be your CSS styles... – Virtua Creative Feb 11 '16 at 23:24
  • Now, the oddest thing is: when the code block comes from backticks ( ```language), the HTML is compiled to a ` ` tag. When it comes from `{% highlight %}`, it's compiled to a ` ` tag ... – Virtua Creative Feb 11 '16 at 23:29
  • I took a look at your repo and saw two things in your `_config.yml` that caught my attention: (1) there's still no `input: GFM` there, so don't forget to add it; (2) you use the plugin `jekyll-paginate`. It's not included by default in Jekyll 3.x, so you need to add it to your `Gemfile` as `gem 'jekyll-paginate'`. – Virtua Creative Feb 11 '16 at 23:39
  • Hey, look: your local browser image shows the line numbers in one character space each! Instead of `10`, you can see it's `1` then `0`; for `11`, you see `1` then `1` again and so on. I'm quite sure that it's your CSS that is causing your issue!! Use Inspect Element in your browser and change the width of this table column to check if that's it! For me, this adds some space to the line numbers' column: `.lineno { width: 35px; }`. Just test that! – Virtua Creative Feb 11 '16 at 23:46
  • Ha! That was spot on: I added `.lineno { width: 35px; }` to my syntax.css file and now the line numbers are no longer one character each, so they appear correctly. The code block is still not entirely unmangled though: the first 10 lines or so and last 10 lines or so are unnumbered, and the lines that don't fit the code block horizontally are wrapped. But I guess that's a matter of tweaking syntax.css until I get it right. Many thanks! (Also: good eye for patterns! I had stared at those line numbers for hours and didn't realize they were 10, 11, 12, etc.) – Parzival Feb 12 '16 at 00:54
  • Oh, about the `kramdown: input: GFM`, I had added it but served only locally (I managed to reproduce the mangling locally by installing `rouge` and `kramdown` from `bundler`). – Parzival Feb 12 '16 at 00:55
  • Awesome! I'll update the answer adding the CSS `lineno` class thing, would you mark my answer as correct? :) You know, it helps...! haha – Virtua Creative Feb 12 '16 at 00:59
  • Thanks! Happy Jekylling! :) If you need something else, here we are! ;) – Virtua Creative Feb 12 '16 at 01:06