· Staff · 6 min read
SQLi WHY
As I was doing chores around the place today (thatching the lawn.. ugh). I saw a post about a new CVE in WordPress. It’s an SQLi… As a former dev, let me say this: THERE IS NO EXCUSE FOR SQLi. NONE. ZERO. NADA.
I also saw the fix mentioning sanitizing the data… UGH. NO, Just NO. Now to be fair, I haven’t delved into the WordPress code, but it seems odd to me that not sanitizing the input caused the query to not be parametrized. I do want to point out that parameterized queries are the only sure fire way to prevent SQLi… Should you sanitize your data? Most certainly, but don’t depend on it preventing SQLi. Stored procedures are NOT a substitute either, as they can be vulnerable to SQLi as well.
Wait, what? Stored procedures can be vulnerable to SQLi??? HOW?
Remember: SQL commands are at their heart, just strings… In fact you can write SQL statements to create and run other SQL statements, but I digress.
Consider this stored procedure:
CREATE PROCEDURE GetUserData
@Username NVARCHAR(50)
AS
BEGIN
DECLARE @SQLQuery NVARCHAR(MAX)
— Constructing the SQL query using string concatenation (vulnerable to SQL injection)
SET @SQLQuery = ‘SELECT * FROM Users WHERE Username = ''' + @Username + ''''
— Executing the dynamic SQL query
EXEC sp_executesql @SQLQuery
END
This creates a stored procedure which takes a user input, concatenates it into a string, then executes it. Classic SQLi
Instead write it this way:
CREATE PROCEDURE GetUserData
@Username NVARCHAR(50)
AS
BEGIN
— Using parameterized query to prevent SQL injection
SELECT *
FROM Users
WHERE Username = @Username;
END
As you can see the same query is run, but the system knows that the input is a NVARCHAR(50) and treats it as such. Instead of treating the input like a string. In all honesty the second stored procedure is easier to read and understand as well, which makes maintenance much easier - remember, YOU might be the person who as to come behind and fix a bug in a year… Make life easier for FUTURE YOU!
Ok, now that we have the example of stored procedures being SQLi vulnerable if written incorrectly out of the way, how about the situation where you as the developer don’t know all the items that can be selected by the end user until runtime?
Well if you think about it, you DO know ALL the parameters that CAN be selected. The user can select any table or column in your database. So using that knowledge you can build a query you control based on the user inputs.
For the purposes of this discussion let’s say the user is asking for a query against one table named ABC with a total of 5 columns (named 1 through 5) and the column they selected is columns 2 and 4 where column 5 = foo The pseudocode looks something like this - keep in mind this is pseudocode and meant to be an example, don’t expect it to compile or work:
userColumnSelections will be where we store the user choices based on their inputs
userColumnSelections = [] #blank array
Initialize the query you are building. This could be a case statement based on user input to create any type of CRUD
string userQuery = “SELECT ”
userColumns is an array of what the user selected
for column in userColumns
case
column == 1
#User Wants column 1, so add it to our array of their choices
userColumnSelections.add(“column1”)
column == 2
#User Wants column 2, so add it to our array of their choices
userColumnSelections.add(“column2”)
column == 3
#User Wants column 3, so add it to our array of their choices
userColumnSelections.add(“column3”)
; column == 4
#User Wants column 4, so add it to our array of their choices
userColumnSelections.add(“column4”)
column == 5
#User Wants column 5, so add it to our array of their choices
userColumnSelections.add(“column5”)
end case
end for
At this point the userColumnSelections array should hold [“column2”, “column4”] - we have abstracted their choices into values that we as the developer control directly.
userQuery += join(”,” userColumnSelections) # Basically concatenate the two columns with a , between them
At this point the userQuery string is “SELECT column2, column4”
userQuery += ” FROM TABLE ABC WHERE ” #NOTE: if they selected other tables I could use logic to build up the JOINS, etc just like I did above. In this example we are only letting them select from table ABC
case
userWhereInput = 1
userQuery += “Column1 = @value”
userWhereInput = 2
userQuery += “Column2 = @value”
userWhereInput = 3
userQuery += “Column2 = @value”
userWhereInput = 4
userQuery += “Column4 = @value”
userWhereInput = 5
userQuery += “Column5 = @value”
end case
At this point the userQuery string reads “SELECT column2, column4 FROM ABC WHERE Column5 = @value” - @value is a placeholder!
NVARCHAR(50) @value = userSearchValue #userSearchValue = “Foo” in this example. The NVARCHAR is declaring the SQL datatype - real code syntax will vary depending upon the language.
Exec userQuery
The SQL engine will take the userSearchValue, smash it into the query and run it. But it will smash it in as the data type specified.
The difference here is, I made CHOICES based on user input to generate the query, but I DID NOT use the values directly in the query.
Yes it can be a pain for very large tables/column sets, etc, and yes there is tooling that can help with this, but understanding how to do it manually means you have a better understanding of what happens under the covers.