0

I have a simple table with the following columns

PrinterModel : Model of the printer.

PrinterName : Name of printer as it appears on the users computer.

PortName : Usually an IP, UNC path or USB.

MachName :Computer that printer was found on.

I have created the following query

SELECT PrinterModel, PrinterName, PortName
FROM [ksubscribers].[dbo].[vCurrPrinterInfo]
WHERE PrinterModel NOT LIKE '%Microsoft%'
      AND PrinterModel NOT LIKE '%PDF%'
      AND PrinterModel NOT LIKE '%HP ePrint%'
      AND PrinterModel NOT LIKE '%Snagit%'
      AND PrinterModel NOT LIKE '%Nuance%'
      AND PrinterModel NOT LIKE '%Amyuni%'
      AND PrinterModel NOT LIKE '%Foxit%'
      AND PrinterModel NOT LIKE '%Brother PC-FAX%'
      AND PrinterModel NOT LIKE '%Remote Desktop%'
      AND PrinterModel NOT LIKE '%Wondershare%'
ORDER BY PortName

The table is basically a collection of printers installed on various computers within an organisation. As the same printer can be installed on multiple computers (think network printer) the table is full of duplicates. We can use the PortName to identify duplicates, however, some rows will show a slight variation in PortName e.g. 10.0.0.200 and 10.0.0.200_1.

Is it possible to compare data in each row focusing only on the data in the PortName column and if the data is similar e.g. 10.0.0.200 vs 10.0.0.200_1 return a single row result?

Here is an example of the table

Model          |     Name         |  Port

OKI MC560 (POS) | OKI MC560 Try 1 | 10.0.0.200

OKI MC560 (PCL) | OKI MC560 Try 2 | 10.0.0.200

OKI MC560 (PCL) | OKI MC560 Try 3 | 10.0.0.200_1

OKI MC560 (TST) | OKI MC560 Try 1 | 10.0.0.200

...and ideally I would return

OKI MC560 (POS) | OKI MC560 Try 1 | 10.0.0.200

I am not too fussed which rows PrinterModel/Printer Name is returned as long as we end up with just one result it could be from any row.

I have reviewed a similar question here SQL compare data from two tables however I can't see how they are comparing the text string using the difference command?

Have attached a screen shot of the current query along with the data returned. Any advise would be greatly appreciated.

SQL Query Screen Shot

Community
  • 1
  • 1
Chiper
  • 3
  • 1
  • You can use `GROUP BY PortName`. If you use Grouping, the other rows in the Select area must be aggregated like `MAX(PrinterModel), MAX(PrinterName)`. If the PortName is different like in your example, you should make the PortName unique. For example remove the `_1` – Sebastian Siemens Feb 15 '17 at 06:46

1 Answers1

0

Try something like this:

Select max(PrinterModel), max(PrinterName), left(PortName,charindex("_",PortName+"_")-1)
from [ksubscribers].[dbo].[vCurrPrinterInfo]
where ......
group by left(PortName,charindex("_",PortName+"_")-1)

That should give you the required results assuming you don't care which one of the possible values appear in the PrinterModel and PrinterName columns.

The "left" function effectively strips away the parts that are from the "_" and and on.

Adding the _ will cover the cases where there is no _ at the end.

theblitz
  • 6,683
  • 16
  • 60
  • 114
  • 1
    Maybe there are other Portnames, which are longer than 10 characters. Like in the picture he provided. – Sebastian Siemens Feb 15 '17 at 07:06
  • @SebastianSchulz True. Though the ones in the picture are actually shorter not longer. :) – theblitz Feb 15 '17 at 07:09
  • Correct their are all kinds of weird and wonderful port names. Here are a few examples – Chiper Feb 15 '17 at 07:36
  • Correct their are all kinds of weird and wonderful port names. Can we use a regular expression to identify an ip address and ignore anything but that? i.e. any text/special chars preceding or post the IP format (xxx.xxx.xxx.xxx) are stripped? Example port names (ignore the - space). - 192.168.0.99 - 192.168.0.99_1 - 192.168.0.99_Letter - 192.168.0.99-2 - 192.168.0.99-Letter - 192.168.0.99Letter - IP_10.0.0.43 - IP_10.112.170.230 - LAN_192.168.0.201 - LAN_ZT230-200 - 53000 - 9100 - COM3: - DOT4_001 - LPT1: - LPT2: - USB001 - BRN001BA956CFCC - CNBJNP_D8492FFBCBF4 – Chiper Feb 15 '17 at 07:55
  • Did some more researches. I think we have part of the answer using group by which is great. The next step would seem to be using a function to do a search and compare using regular expression looking for the IP address. I have found this post https://social.msdn.microsoft.com/Forums/sqlserver/en-US/c2d49425-ff3f-4d13-8769-e0c63350a250/need-to-find-an-ip-address-pattern-in-a-column?forum=transactsql which provides the code to do the searching. Can anyone assist with incorporating it into my query? Happy to pay someone to do this for me as I'm no database guru and this is way out of my league. – Chiper Feb 15 '17 at 22:37
  • This is a Regex which matches an IP address: (\d+\.\d+\.\d+\.\d+) Using it in the patindex function will find the point at which that pattern starts in the string. – theblitz Feb 16 '17 at 07:44
  • Thanks so I presume I call the function and it operates on that column for every row? If that is the case do you have to write the pattern that it finds to a new column and then do a group by on that one? Sorry I really don't know much about SQL queries past the good old Select, From, Where. – Chiper Feb 16 '17 at 12:07
  • patindex("(\d+\.\d+\.\d+\.\d+) ", PortName) will give you the position at which the first digit of the IP address appears in the column for that row. Then you will need another one to figure out where it ends and do a substring. Gets very very messy in the end. – theblitz Feb 16 '17 at 12:45