I have an assignment to use Java and C with a MySQL database, compare the results, and explain why the results come out the way they do.

No. of records    Execution time (ms)
                  Java (JDBC)    C (ODBC)
100               586            76
500               628            216
2000              733            697
5000              963            1056
10000             1469           2178

As you can see, with fewer records fetched from the database, C (ODBC) performed better. But as the number of records increased, Java (JDBC) came out as the winner.

My guess is that the ODBC driver loads much faster than the JDBC driver, but that JDBC has the better access speed, which would explain these results. However, I cannot find this reasoning stated anywhere.

Any suggestions, please?

user2864740
user3213918
  • I'm pretty sure it's a benchmarking issue. Writing proper benchmarks is actually quite difficult. Can you show the respective C and Java code you're using? I'm pretty sure the numbers should be almost identical. Also, what kind of data are you using? 500 ms for 100 records seems ridiculously high. Even the 76 ms in C seems somewhat high. What kind of times do you get through the MySQL client? – Mikkel Løkke Feb 17 '14 at 11:33

5 Answers

The following statements from the MathWorks website appear to be generally applicable.

Deciding Between ODBC and JDBC Drivers

Use native ODBC for:

  • Fastest performance for data imports and exports
  • Memory-intensive data imports and exports

Use JDBC for:

  • Platform independence allowing you to work with any operating system (including Mac and Linux®), driver version, or bitness (32-bit or 64-bit)
  • Using Database Toolbox functions not supported by the native ODBC interface (such as runstoredprocedure)
  • Working with complex or long data types (e.g., LONG, BLOB, text, etc.)

Tip:

  • On Windows systems that support both ODBC and JDBC drivers, pure JDBC drivers and the native ODBC interface provide better connectivity and performance than the JDBC/ODBC bridge.
Dennis Jaheruddin

The following points may help:

Multithreading:
  • JDBC is designed for multi-threaded use
  • ODBC thread safety depends on the driver, so it should not be assumed

Flexibility:
  • ODBC originated on Windows; implementations such as unixODBC and iODBC exist for other platforms, but driver availability varies
  • JDBC is specific to Java, and is therefore supported on whatever OS supports Java

Power: you can do everything with JDBC that you can do with ODBC, on any platform.

Language:
  • ODBC is procedural and language independent
  • JDBC is object oriented and language dependent (specific to Java)

Heavy load:
  • JDBC is faster
  • ODBC is slower

ODBC limitation: it is a relational API and can only work with data types that can be expressed in a rectangular or two-dimensional format (it will not work with data types like Oracle's spatial data type).

API: the JDBC API is a natural Java interface that builds on ODBC's design, and therefore retains some of the basic features of ODBC.

soulemane moumie

Are you sure you are comparing drivers and not whole environments?

I see that for ODBC you use a C program. Try the ODBC driver from the same program you use to test JDBC, this time going through the JDBC-ODBC bridge (I often use Jython for such things). Of course, the bridge adds some overhead. Also remember that the JVM uses JIT compilation: the longer your application runs, the better it performs.
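
The JIT effect can be observed without a database at all. The following is a minimal sketch, not a rigorous benchmark: `work` is a hypothetical stand-in for "fetch and process n rows", and the class and method names are made up for illustration. It times the same work several times inside one JVM; on a typical JVM the first rounds tend to be slower than the later ones, but the actual numbers depend on the machine and JVM.

```java
import java.util.Arrays;

class WarmupSketch {
    // Written to after every round so the JIT cannot discard the work as dead code.
    static volatile long sink;

    // Hypothetical stand-in for "fetch and process n rows": pure CPU work, no database.
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (i * 31L) ^ (i >>> 3);
        }
        return sum;
    }

    // Runs work(n) `rounds` times in the same JVM and returns the elapsed nanoseconds per round.
    static long[] timeRounds(int rounds, int n) {
        long[] nanos = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            sink += work(n);
            nanos[r] = System.nanoTime() - start;
        }
        return nanos;
    }

    public static void main(String[] args) {
        // Compare the first entries with the last ones.
        System.out.println(Arrays.toString(timeRounds(10, 1_000_000)));
    }
}
```

A single short run, as in the question, measures mostly the early, slow rounds on the Java side.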

Performance is important, but for me the stability of the drivers matters much more. I have had problems with various ODBC drivers and now prefer JDBC drivers. But even ODBC drivers can work with high-load multi-threaded servers for many months.

Michał Niklas

I always find these types of discussions about ODBC vs JDBC performance a bit off the mark, not unlike the discussions around JPA vs Hibernate.

A C++ app using an ODBC driver written in C will likely be lightning fast for the small portion of the database interaction taking place. Similarly, a Java program connecting to a database using a vendor's driver which has been optimized for their particular database will be pretty darn fast as well.

But looking at a request-response cycle with a database, the network latency involved in making the request is significantly greater than the overhead of the API. Similarly, the time it takes to search a database, update a record, or hold a transaction alive will be orders of magnitude more significant than any efficiency one would gain from choosing ODBC over JDBC.

Use the driver recommended by the vendor given the language you are developing with. And leave the performance issues to the database admins and SQL developers to solve. This isn't where you're going to resolve your database bottlenecks.

Cameron McKenzie
  • I generally agree, but particular driver implementations can compound the effect of network latency with inefficient chunking or batching of data. For example, the AWS Athena JDBC driver paged through requests at 1000 rows max; retrieving 1m rows would be 1000 pages. Each page is a request-response, so that is 1000 extra round trips. It was fixed by making it stream, with performance improvements of at least 2x and up to 6x when fetching more than 10k rows (see https://docs.aws.amazon.com/athena/latest/ug/release-note-2018-08-16.html). This is not an ODBC vs JDBC thing, but drivers can affect performance. – Davos Jan 30 '20 at 12:33
  • Of course they do. We had an incredible performance gain using third-party drivers for one of our servers versus the one provided by Oracle. I think it was DataDirect (?) or CData, but the difference was huge. – arana May 13 '22 at 21:21
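
The paging arithmetic in the comment above can be sketched as follows. The 20 ms latency per round trip is an assumed figure for illustration only, and the class and method names are hypothetical; nothing here calls a real driver.

```java
class PagingSketch {
    // Number of request-response cycles needed to fetch `rows` rows at `pageSize` rows per page.
    static long roundTrips(long rows, long pageSize) {
        if (rows < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and pageSize > 0");
        }
        return (rows + pageSize - 1) / pageSize; // ceiling division
    }

    // Time spent purely waiting on the network, for an assumed latency per round trip.
    static long latencyCostMs(long rows, long pageSize, long latencyMs) {
        return roundTrips(rows, pageSize) * latencyMs;
    }

    public static void main(String[] args) {
        long rows = 1_000_000;
        System.out.println(roundTrips(rows, 1_000));        // prints 1000
        System.out.println(latencyCostMs(rows, 1_000, 20)); // prints 20000 (assumed 20 ms latency)
    }
}
```

With those assumptions, the waiting time alone is 20 seconds, regardless of which API the driver implements.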

The point of the assignment was to introduce you to a real application example that compares the performance characteristics of a pre-compiled language, C, with those of a just-in-time (JIT) interpreted^ language, Java.

^The difference between interpreted and compiled code is a pedantic argument. In a loose sense all code is interpreted at runtime: even compiled machine code is decoded by the CPU into memory accesses and low-level operations as it executes.

The important distinction here is pre-compiled vs JIT. The difference is how close the code gets to machine code at build time vs run time. C gets closer at build time (for the most part, ignoring recent advances in heuristic compilation in Java).

This assignment scenario adds in the realistic complication of database data retrieval, which partly serves to increase the numbers so you have something to work with, but mostly to accentuate that the program time consists of fixed (start-up) and variable (in this case I/O-bound) phases. It could also have been a CPU-bound example; the same pattern would have emerged.

It starts to be easier to see if you graph the values in a chart.

[Figure: Java vs C app runtimes, illustrating Java's fixed JIT startup cost]

Look at the slope of the two lines. I think it's an anomaly that in your numbers the C program is slower over 4000+ records, but that's not important and should not distract from the core point of the example.

  • The C app starts with 100 rows in 76ms, quite close to the origin at 0
  • The Java app starts at 586ms, quite close to 500.

The C app startup time is negligible so can be effectively ignored.

The Java app startup time, on the other hand, is around 500ms. It comes from starting the Java Virtual Machine (JVM), loading classes, and interpreting and JIT-compiling the Java bytecode; it's a fixed start-up cost that you pay for each run of the code no matter how much data you retrieve from the database.

You can consider that 500ms to be the real origin of your data access times, and see that the program grows linearly from that 500ms point.
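
One way to put numbers on the fixed and variable parts is a straight-line fit through the five measurements from the question. The following is a rough sketch using ordinary least squares; the class and method names are made up for illustration, and five data points from single runs are far too few to draw firm conclusions. On these numbers the fit comes out at roughly 566 ms fixed plus 0.088 ms per record for Java, and roughly 128 ms fixed plus 0.204 ms per record for C.

```java
class StartupFitSketch {
    // Measurements copied from the table in the question.
    static final double[] RECORDS = {100, 500, 2000, 5000, 10000};
    static final double[] JAVA_MS = {586, 628, 733, 963, 1469};
    static final double[] C_MS = {76, 216, 697, 1056, 2178};

    // Ordinary least-squares fit of y = intercept + slope * x; returns {intercept, slope}.
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx;
        return new double[] {meanY - slope * meanX, slope};
    }

    public static void main(String[] args) {
        double[] j = fit(RECORDS, JAVA_MS);
        double[] c = fit(RECORDS, C_MS);
        // Intercept = estimated fixed cost; slope = estimated cost per record.
        System.out.printf("Java: fixed %.0f ms, %.3f ms/record%n", j[0], j[1]);
        System.out.printf("C:    fixed %.0f ms, %.3f ms/record%n", c[0], c[1]);
    }
}
```

The intercept is the estimated fixed cost and the slope the estimated per-record cost, which is the split between start-up and data access described above.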

It is not the drivers, JDBC vs ODBC, per se. The JDBC driver is a Java library that is effectively an extension of your program and subject to the same JIT; similarly, the ODBC driver is a pre-compiled C library that is effectively an extension of your program. It wouldn't be fair to blame the drivers specifically (assuming they are both optimized); it is more about the application context as a whole.

Davos