0

I am having a Dirty database and i have to clean it, by first extracting city name from the address (which is written with several variation) and then standardize spelling the city name and replace all the data with standardize city name

I have made a Cities_look up table. in which all possible city name variations are written this table is having 2 columns for example

Standard_City_Name  Dirty_City Name
NEW YORK             NEW
NEW YORK             NY
NEW YORK             newyork
NEW YORK             New york
NEW YORK             ny
NEW YORK             NWYK

now i have to Trim city name from the address by looking up in this table comparing all the values in Dirty_City_Name column then triming it out. and then make new column of city and write there the Standard_City_Name

i am ding this

SELECT TRIM(TRIM(TRAILING Cities_lookup.Dirty_City_Name FROM Address) ), Cities_lookup.Dirty_City_Name
from Student left join Cities_lookup 
 on Student.Address like CONCAT('% ',Cities_lookup.Dirty_City_Name)

Can anyone help?

Matt
  • 22,721
  • 17
  • 71
  • 112
ssbb
  • 245
  • 2
  • 5
  • 12

1 Answers1

2

Because you know the address is at the end of the string, you can use a trick to replace it via stuff(). That is, just replace the characters in those positions with the empty string.

I find the easiest way to do this is by reversing the string, replacing the first +1 characters, and reversing again. In this case, is the length of the dirty city name and the "+ 1" is for the space:

select reverse(stuff(reverse(Address), 1, len(cl.Dirty_City_Name) + 1, '')), cl.Standard_City_Name
from Student s left join
     Cities_lookup cl
     on s.Address like '% '+cl.Dirty_City_Name
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • its giving error "The multi-part identifier "cl.Standard_City_Name" could not be bound." – ssbb Dec 11 '12 at 20:05
  • 1
    I fixed the alias. It was "sl" not "cl". – Gordon Linoff Dec 11 '12 at 20:12
  • Its giving error sying :S 'CONCAT' is not a recognized built-in function name. – ssbb Dec 11 '12 at 20:58
  • 1
    @ShanzaButt . . . That was in your original SQL. Are you familiar with SQL Server? It does not (historically) have a `concat1` function. – Gordon Linoff Dec 11 '12 at 21:01
  • Now how to add this function in SQL server? actually i am beginner :S – ssbb Dec 11 '12 at 21:04
  • saw the edited code...now it is giving a new error "Invalid object name 'Student'." but Student is a table in my database :S – ssbb Dec 11 '12 at 21:13
  • 1
    @ShanzaButt . . . The `from` statement gives the tables shorter aliases. `Student` is refered to as `s` and `Cities_Lookup` as `cl`. The shorter aliases make it easier to read and write queries. – Gordon Linoff Dec 11 '12 at 21:18
  • how to fix this error? "Invalid object name 'Student'" student table exist in database. NOTE: Address is saved as a string in the data base :S – ssbb Dec 11 '12 at 21:23
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/20965/discussion-between-shanza-butt-and-gordon-linoff) – ssbb Dec 11 '12 at 21:24
  • how to use these results to update in another table's columns i am doing this but its not working update Transformed_All_Student SET Address= ( SELECT LEFT(Address,LEN(Address)-LEN(g.Dirty_City_name)) from All_Students p left join Cities_lookup g on p.Address like '% '+g.Dirty_City_name WHERE p.Address IS NULL ) where Address is null – ssbb Dec 14 '12 at 11:03