53

I was working through the challenges from pythonchallenge, writing the code in Ruby, specifically this one. Its page source contains a really long string full of special characters. I was trying to find a way to delete them, or to check for the alphabetical characters only.

I tried using the scan method, but I don't think I was using it properly. I also tried delete! like this:

    a = "PAGE SOURCE CODE PASTED HERE"
    a.delete! "!", "@"  #and so on with special chars, does not work(?) 
    a

How can I do that?

Thanks

Chad Bingham
kwoskowicz

7 Answers

153

You can do this, which removes every character that is not a digit or an ASCII letter:

a.gsub!(/[^0-9A-Za-z]/, '')
Alok Anand
20

Try gsub:

a.gsub!(/[!@%&"]/,'')

You can test the regexp on rubular.com.

If you want something more general, you can list the valid characters and remove everything that is not among them:

a.gsub!(/[^abcdefghijklmnopqrstuvwxyz ]/,'')
arieljuod
  • 2
    I think this `[^A-Za-z ]` works better, in this case. Otherwise, if you have a sentence, which typically **should** start with a capital letter, you will lose your capital letters. You would also lose any `1337 speak`, or other possible crypts within the text. Case in point: `phrase = "Joe can't tell between 'large' and large." => "Joe can't tell between 'large' and large."` – ThaDick May 11 '17 at 18:03
8

When you give multiple arguments to String#delete, it's the intersection of those arguments that is deleted. a.delete! "!", "@" deletes the intersection of the sets ! and @, which is empty, so nothing is deleted and the method returns nil.

What you wanted to do is a.delete! "!@" with the characters to delete passed as a single string.

Since the challenge is asking to clean up the mess and find a message in it, I would go with a whitelist instead of deleting special characters. The delete method accepts ranges with - and negations with ^ (similar to a regex) so you can do something like this: a.delete! "^A-Za-z ".

You could also use regular expressions as shown by @arieljuod.
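
To make the behaviour described above concrete, here is a minimal sketch. The `sample` string is a made-up stand-in for the page source, not the actual challenge text.

```ruby
# Hypothetical stand-in for the page source (single quotes avoid any interpolation).
sample = '%%h@e!l#l@o w!o@r%l#d'

# Multiple arguments: only characters present in ALL argument sets are removed.
# "!" and "@" have no character in common, so the string comes back unchanged.
no_change = sample.delete('!', '@')

# The bang version returns nil when it did not modify the receiver.
bang_result = sample.dup.delete!('!', '@')

# One argument listing every unwanted character (a blacklist).
blacklisted = sample.delete('!@#%')

# A leading ^ negates the set and - denotes a range (a whitelist).
whitelisted = sample.delete('^A-Za-z ')

p no_change, bang_result, blacklisted, whitelisted
```

The whitelist form is probably the more robust choice here, since you don't have to know in advance which special characters appear in the input.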

dee-see
6

gsub is one of the most used Ruby methods in the wild.

specialname = 'Hello!#$@'  # single quotes: inside double quotes, #$@ would interpolate the global $@
cleanedname = specialname.gsub(/[^a-zA-Z0-9\-]/, "")
dee-see
Pradeep
5

I think a.gsub(/[^A-Za-z0-9 ]/, '') works better in this case. Otherwise, if you have a sentence, which typically should start with a capital letter, you will lose your capital letter. You would also lose any 1337 speak, or other possible crypts within the text.

Case in point:

phrase = "Joe can't tell between 'large' and large." => "Joe can't tell between 'large' and large."

phrase.gsub(/[^a-z ]/, '') => "oe cant tell between large and large"

phrase.gsub(/[^A-Za-z0-9 ]/, '') => "Joe cant tell between large and large"

phrase2 = "W3 a11 f10a7 d0wn h3r3!"

phrase2.gsub(/[^a-z ]/, '') => " a fa dwn hr"

phrase2.gsub(/[^A-Za-z0-9 ]/, '') => "W3 a11 f10a7 d0wn h3r3"

ThaDick
2

If you don't want to change the original string (i.e. you only need to solve the challenge), you can iterate over its characters and print the lowercase letters:

str.each_char do |letter|
  if letter =~ /[a-z]/  
    p letter    
  end  
end  
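
Since the question mentions scan, here is a sketch of the same non-mutating idea using String#scan. The `str` value is a hypothetical stand-in for the page source.

```ruby
# Hypothetical stand-in for the page source.
str = '%%r^u@b!y#'

# scan returns an array holding every match of the pattern;
# with /[a-z]/ that is one single-character string per lowercase letter.
letters = str.scan(/[a-z]/)

# Join the matches back into one string; str itself is left untouched.
message = letters.join

puts message
```

This collects the letters into a single string instead of printing them one per line, which may be easier to read when the hidden message is a word.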
AGS
0

You will have to write your own string-sanitizing function; you could easily do that with a regex and the gsub method.

Atomic sample:

your_text.gsub!(/[!@\[;\]^%*\(\);\-_\/&\\|$\{#\}<>:`~"]/,'')
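
One thing to be aware of with gsub!: it returns nil when no substitution was made, so its return value is not safe to use as the cleaned string. A small sketch, using a shortened, hypothetical character class (`PUNCTUATION`) rather than the full one above:

```ruby
# Hypothetical, shortened character class for illustration only.
PUNCTUATION = /[!@%&*]/

clean = 'abc123'   # contains nothing to remove
dirty = 'a!b@c%'   # contains characters to remove

# gsub! modifies the receiver in place and returns nil if nothing matched.
bang_on_clean = clean.dup.gsub!(PUNCTUATION, '')
bang_on_dirty = dirty.dup.gsub!(PUNCTUATION, '')

# gsub always returns a new string, whether or not anything matched.
plain_on_clean = clean.gsub(PUNCTUATION, '')

p bang_on_clean, bang_on_dirty, plain_on_clean
```

If you need the result as a value (for example to assign it or pass it on), gsub is likely the safer choice; gsub! fits when you only care about the in-place change.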

API sample:

Route: post 'api/sanitize_text', to: 'api#sanitize_text'

Controller:

  def sanitize_text
    # Reject the request when no text was supplied.
    return render_bad_request unless params[:text].present?
    # Use gsub, not gsub!: gsub! returns nil when nothing was replaced.
    sanitized_text = params[:text].gsub(/[!@\[;\]^%*\(\);\-_\/&\\|$\{#\}<>:`~"]/, '')
    render_response({ safe_text: sanitized_text })
  end

Then you call it

POST /api/sanitize_text?text=abcdefghijklmnopqrstuvwxyz123456<>$!@%23^%26*[]:;{}()`,.~'"\|/
d1jhoni1b