So working on my scraping project. It’s hurting my head. Scraping is no fun to figure out. What I’m stuck on right now is iterating. When I have some time tomorrow I’m going to work on completely understanding iterating when scraping. Here’s what I’m looking at currently:
<div class="social-icon-container">
<a href="https://twitter.com/empireofryan"><img class="social-icon" src="../assets/img/twitter-icon.png"></a>
<a href="https://www.linkedin.com/in/ryan-johnson-321629ab"><img class="social-icon" src="../assets/img/linkedin-icon.png"></a>
<a href="https://github.com/empireofryan"><img class="social-icon" src="../assets/img/github-icon.png"></a>
<a href="https://www.youtube.com/watch?v=C22ufOqDyaE"><img class="social-icon" src="../assets/img/rss-icon.png"></a>
</div>
Then I’m doing (among other things above it):
social = doc.css(".social-icon-container a")
social.each do |link|
I’m struggling to understand what |link| is inside the loop. I’m thinking it’s each instance of a
because there are 4 of those and I’m setting social
equal to the a
. However, I’m not sure. I know when I call doc.css(".social-icon-container a")
in pry I end up with Nokogiri’s version of everything inside the div. I’m sure I’ll crush it tomorrow. I know my logic of looping over each a
and using an if
or even case
to check if the img src=
is ==
to one of the social networks, then setting the href
value to equal a variable I’ll use later is the way to go and solid.
Time spent today: 2:08
Time spent total: 103:32
Lessons completed today: 0
Lessons completed total: 284