20 Nov 2015

Reverse Port Forwarding and how !

Recently I had to work on a PoC that required spinning off Docker instances with custom configurations. The problem wasn't so much the Docker aspect (probably a story for another time) as the fact that, as luck would have it, we couldn't commission a separate machine for this PoC, and I eventually had to do the entire development inside a 2G VirtualBox VM on my 8G laptop.

The tricky part was that we had to submit the URL to the PoC Landing Page even before it could be ported to a full-fledged VM/server.

So effectively, we:
  • Submitted a URL pointing to ServerA, which ran Apache but had no space to host a 40G VM.
  • Had a Laptop where the VM was developed, but which couldn't stay online since it wasn't always connected
  • Had another server (ServerB) where the VM needed to be ported, so that it would always be up for PoC evaluation
  • ServerA was in the US, whereas ServerB (& the Laptop) were half-way across the world in India.
Separately, the VM was an Ubuntu installation that had the following:
  • A running NodeJS (Bootstrap / Express combo) serving the PoC GUI
  • Separately, it hosted the Docker runtime that spun off Docker instances, each running its own proprietary application + Tomcat on port 8080, all of which was eventually supposed to be visible company-wide
  • Thirdly, it had a PostgreSQL 9.3 database that served as a backend to all the running docker-based app instances

To connect ServerA with the PoC hosted on ServerB we tried the following:
  • Creating VirtualHost entries on the ServerA that pointed to corresponding ports on the VM
    • This failed because the VM was running inside a NATted configuration and thus was invisible to ServerA
  • Then we tried an alternate solution, wherein Apache->VirtualHost entries on ServerA were pointed to port 80 on ServerB. This required port-forwarding port 80 on ServerB to port 3000 (NodeJS) on ServerB. We used NetSH for this, but it failed for various reasons, some of which we could identify and some we couldn't. For example, we installed the ipv4 module for NetSH, also tried installing ipv6, as well as enabling Windows Firewall for NetSH to work (contrary to the popular belief that Windows Firewall might be the reason for things getting blocked)
  • We finally settled on running persistent SSH connections directly from the VM to ServerA (thereby bypassing all Windows Firewall issues) and setting up Remote Port-Forwarding such that remote ports could connect to the Node server inside the VM (see the sketch after this list)
    • This too initially didn't work as expected. To identify the cause we:
      • Disabled SELinux on ServerB, then ServerA (although that didn't make sense), as well as on the VM, to no avail
      • Then we disabled IPTables on all the relevant machines and that still didn't help
      • Finally we realised that the Remote Port-Forwarding, though working well, was getting attached to the localhost interface on ServerA. That meant that although 127.0.0.1:80 on ServerA was working as expected, 192.168.1.1:80 on the same server immediately gave a "connection failed". It took a while to realise that, even though I was logging in trans-atlantic, the immediate 'Connection Failed' was an issue with ServerA and not with this side of the line.
      • What complicated things after that was that the regular -g option was insufficient. It required setting the GatewayPorts option in ServerA's sshd_config (with an sshd restart) to work as expected.
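
A minimal sketch of what the final setup looked like (hostnames, user and port numbers here are illustrative, not the real ones):

# On ServerA, in /etc/ssh/sshd_config (followed by an sshd restart):
#   GatewayPorts yes

# From inside the VM: a persistent reverse tunnel, so that connections to
# ServerA:80 get forwarded to the Node server listening on port 3000 in the VM.
# (Remote-forwarding a privileged port like 80 needs a root login on ServerA;
#  otherwise forward a high port and let Apache proxy to it.)
ssh -N -R 0.0.0.0:80:localhost:3000 user@ServerA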

Eventually, after a few hours of transatlantic jugglery, we finally got a working URL that looked something like this... What a day !



12 Nov 2015

Pigz - Parallel GZip

For a while, I've been frustrated by the fact that GZip is unable to use the idle CPUs on my laptop. I recently read about PIGZ (a drop-in GPL replacement) that implicitly consumes all CPUs.

Pronounced Pig-Zee, PIGZ is a drop-in replacement for GZIP and automatically consumes all CPUs of the host machine in order to get the compression completed much faster.

OLD: tar -cf - sourcefolder | gzip > sourcefolder.tar.gz

NEW: tar -cf - sourcefolder | pigz > sourcefolder.tar.gz
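
pigz also accepts a thread count (-p) and the usual gzip compression levels, so on a shared box something like this keeps a few cores free (flags as I remember them; check the man page):

LIMITED: tar -cf - sourcefolder | pigz -p 4 -9 > sourcefolder.tar.gz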

Enjoy!

11 Jun 2015

Using a Pi as a home-based Media Server

This is among a series of articles on my experience with the Pi.

This article is about using a Pi as a primitive low-end File-Server for your home network:

Expectations:

  • Torrent-Server
    • Download Torrents
    • Store on a File-Share
  • Windows Share
    • Serve the File-Share as a Windows Share Drive
    • Allow Read / Write to this Windows Share Drive
  • Use File-Share as Media-Server
    • Using any Smartphone / Laptop
      • Play Movies using VLC
      • View all Photos / Home-Videos


Effectively:
  1. Always on
    1. Always accepting new Torrent requests
    2. Instantly start downloading 
  2. In Real-time
    1. Allow users to view torrent download status
  3. Use any UPnP Phone App to play Video content over WiFi on a SmartPhone
  4. Use VLC (Network Streaming) to play any Video over WiFi on Laptop / Desktop
  5. Use Windows Share Drive and view all Photos / Home-Videos as needed

How-To:
  • On Server
    • Install Torrent-Daemon
    • Configure Torrent-Daemon to listen on RPC requests
      • Here's the howto for that
    • Configure Samba
      • Set the Download folder to be shared
  • On Windows
    • Install Transmission-GUI-Remote
    • Configure GUI to use the RPC based Torrent-Daemon server
    • Make this application the default for .torrent & magnet files
  • On Linux
    • Install Transmission-Remote
      • sudo apt-get install transmission-remote
    • Configure it to use the Daemon (instead of downloading directly with transmission-cli); see the sketch below
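
A rough sketch of the relevant configuration pieces (paths, addresses and share names here are illustrative, not the exact ones I used):

# /etc/transmission-daemon/settings.json (edit while the daemon is stopped)
#   "rpc-enabled": true,
#   "rpc-whitelist": "127.0.0.1,192.168.1.*",
#   "download-dir": "/disk/downloads",

# /etc/samba/smb.conf -- expose the same folder as a read/write share
#   [media]
#   path = /disk/downloads
#   read only = no

# From any Linux box on the LAN, hand a torrent to the Pi over RPC
transmission-remote 192.168.1.10:9091 --add "magnet:?xt=..."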

Pros/Cons:
  • Pros
    • Once torrent download has begun, client can disconnect / shutdown client computer
    • Server continues to get torrents, after a restart
    • miniDLNA serving speed pretty decent
      • Watching a movie (over WiFi) on VLC
        • CPU ratio barely 0.01 which is pretty decent
          • Should be able to easily serve a small army :)
            • If no other IO is happening
            • If WiFi isn't the bottleneck 
  • Cons
    • Storage on Pi needs to be managed from time-to-time
      • Currently this is managed in a highly ad-hoc fashion
        • Truly taking RAID concept to heart!!
          • Have 6 Pen Drives (ranging from 4GB to 16GB)
            • 8GBs are INR ~170 ( $3 )
            • All connected via 3 USB Hubs (both USB 2.0 and USB 3.0)
              • One Powered, two non-powered hubs
            • Most on Btrfs
              • Pretty stable; surprising that I still get 1+ MBps with the Pi's CPU
            • Some are VFAT, since I occasionally need to move stuff off the Pi to a Windows Laptop
            • All mounted under /disk
              • Very temporary arrangement
                • Ideally I'm looking for an LVM-like solution (one that would let me add / remove pen-drives on the fly and still address them as a single volume) that is actually suited for this purpose.
            • But miniDLNA picks up Photo / Audio / Video pretty well from all different mounted folders
    • Download speed limitations
      • Internet Speed = 2 MBps
      • Daemon Download speed capping = 1Mbps
        • But PI never reaches it :D !
          • Just imagine the poor configuration
            • Pi + Btrfs + USB 2.0 + 3 USB Chain + Unpowered Hub
              • Still consistent at 450kbps! Impressive!
    • Lack of a Transmission-Daemon configuration tool means all configuration has to be done by hand-editing configuration files at the console
  • Careful configuration
    • Give all access to 'all' users only if you're sure that all users are going to be careful

25 May 2015

SAVE Tax with new NPS provisions!!!

This article is about NPS (National Pension Scheme), or the New Pension System as some may know it, which is a loose-ended replacement of the erstwhile pension system prevalent in India. Although not a 100% replacement, this is the next best thing, and thus understanding it better is a good idea.

For all those trying hard to save Taxes while working their way around the Indian Income Tax system, the following may help in clarifying doubts / providing more avenues.

As per Income Tax provisions effective in 2015-16, investments in NPS can be done in three ways:
  • Employee contribution, under 80C
  • Direct investment in NPS (outside of your employer), under 80CCD(1B)
  • Employer contribution, under 80CCD(2)

 Let's look at these in more detail:

Employee contribution, under 80C

Any Indian (that has a PRAN) can invest in NPS by contributing a minimum of Rs. 500/- per month, or a minimum of Rs. 6,000/- annually, to their NPS account. This investment into NPS can be deducted from the person's taxable income (subject to the maximum cap of Rs. 1,50,000/- under 80C).

Although a good idea, this is generally futile, since most people (who have been working for a while) have already reached their Rs. 1,50,000/- 80C limit via other means (e.g. Life Insurance / ULIP / PF etc.). In that case, investing in NPS under 80C, although good in the long term, does not contribute to saving tax for the current financial year.


Direct investment in NPS (outside of your employer), under 80CCD(1B)

Any Indian (that has a PRAN) can invest directly into NPS, without the support of his / her employer. This facility has been available for a while, and is the oldest form of investing in NPS. Till recently, the caveat to this form of investment was that this did not have any Tax Exemption.

Although modified earlier, as of 2015-16, the Income Tax Provisions are such that investments in NPS (made directly) up to Rs. 50,000/-  can be deducted from the person's Taxable Income. To clarify, this doesn't mean that one can't invest more than Rs. 50k, but that only the first Rs. 50k of that amount can be deducted from his / her taxable income.

For example, let's assume that Ms. Lata has consumed her Rs. 1,50,000/- of 80C investment options (via Life Insurance / PF investments) and her net taxable income is Rs. 3,75,000/-. Now let's assume that she invested Rs. 1,50,000/- directly into NPS (outside of her employer's assistance); her net taxable income would then become Rs. 3,25,000/- (i.e. 3,75,000 - 50,000). So although the entire sum of Rs. 1,50,000 was invested into NPS, only the first Rs. 50,000 was deducted from taxable income.


Employer contribution, under 80CCD(2)

The third and the most unclear & interesting section is the 80CCD(2) that allows an employee to save much more tax than was possible earlier.

Under this section, (apart from the above two clauses), an employee can request his / her employer to deduct a given sum from the monthly salary, and invest in NPS. This contribution (upto a maximum of 10% of Basic Pay) can be additionally deducted from the employee's taxable income, which in some cases can be a big boon to the net tax outflow in the financial year.

Example for all above sections

Let's take an example that elaborates all the sections given above:

Let's assume that Ms. Lata's Basic pay is Rs. 11,00,000 (11 lakh) and she has invested Rs. 1,00,000 in Life Insurance and Rs. 40,000 in ELSS Funds, as well as Rs. 10,000 in NPS (under section 80C). Further, she directly invested Rs. 50,000 in her NPS account, outside of her employer's assistance (under section 80CCD(1B)). Lastly, she requested her employer to invest Rs. 10,000/- per month in her NPS account under Section 80CCD(2).

Then her net taxable income would be as follows:

Taxable income = 11,00,000 
                   - Rs. 1,50,000 under 80C      - max (1.5 lakh)
                   - Rs.   50,000 under 80CCD(1B)- max (50k)
                   - Rs. 1,10,000 under 80CCD(2) - max (10% of Basic)
               = 11 lakh - 1.5 lakh - 0.5 lakh - 1.1 lakh
               = 7.9 lakh

This should clarify all doubts pertaining to investment in NPS for the financial year 2015-16.

17 May 2015

Basic OLAP Support in PostgreSQL

While reviewing an existing application, I thought it'd be worthwhile to review how good / bad PostgreSQL is in terms of OLAP. This (growing) post is going to be my (un)learning of how ready PostgreSQL is.

  1. Row Numbering
    1. Support: Yes. 
    2. Use: Row_Number() function numbers rows generated in a result-set.
    3. Example:

      SELECT
        row_number() OVER (ORDER BY marks DESC) AS rn,
        name
      FROM x;
    4. Review: Some databases have different variants that accomplish this (e.g. Oracle has a pseudo column called ROWNUM), but PostgreSQL fully supports the SQL-compliant syntax.
  2. Rank()
    1. Support: Yes. 
    2. Use: Rank() and Dense_Rank() functions number the rank of the compared item. 
    3. Example:

      SELECT
        rank() OVER (ORDER BY marks DESC) AS rn,
        dense_rank() OVER (ORDER BY marks DESC) AS drn,
        name
      FROM x;
    4. Review: It's useful and fully supported.
  3. Window Clause
    1. Support 
      1. OVER (PARTITION BY): Yes
      2. OVER (ORDER BY): Yes
      3. OVER (RANGE): Yes
    2. Use:  Read more here.
    3. Example:  
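
      A minimal sketch (assuming the hypothetical table x from the examples above also has a class column):

      SELECT
        name,
        marks,
        AVG(marks) OVER (PARTITION BY class) AS class_avg,
        SUM(marks) OVER (ORDER BY marks DESC
                         RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
      FROM x;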
    4. Review: These are extremely helpful for people serious about data-extraction / reporting and fully supported.
  4. NTile
    1. Support: Yes
    2. Use: Ntile(n) distributes the ordered rows into n roughly equal-sized buckets (quartiles, deciles, percentiles, and so on).
    3. Example:

      SELECT
        ntile(4) OVER (ORDER BY marks DESC) AS quartile,
        ntile(10) OVER (ORDER BY marks DESC) AS decile,
        ntile(100) OVER (ORDER BY marks DESC) AS percentile,
        name
      FROM x;
    4. Review: Versatile and fully supported.
  5. Nested OLAP Aggregations
    1. Support: No
      1. But doable with alternative SQL? : Yes
        1. Is that as Performant? : Mostly No
    2. Description: Allow something like

      SELECT
        subject,
        AVG(SUM(marks) GROUP BY class)
      FROM marks
      GROUP BY subject;
    3. Alternative:  This could be done with Sub-Selects like this:

      SELECT
        subject,
        AVG(sum_marks) AS avg
      FROM (
        SELECT
          subject,
          class,
          SUM(marks) AS sum_marks
        FROM marks
        GROUP BY subject, class
      ) mrk
      GROUP BY subject;
    4. Review: In the two examples we are trying to calculate the per-subject average of (total marks obtained in different classes). Although PostgreSQL doesn't support this form of nested aggregates, it clearly is a neat way of doing things. The alternative admittedly looks like a kludge, so native support would be a nice-to-have feature.
  6. GROUPING SETS
    1. Support: Yes (in 9.5)
    2. Alternative:  This could be alternatively done with UNION ALL like this:
      SELECT
        SubjectID,
        NULL AS StudentID,
        AVG(marks)
      FROM marks
      GROUP BY SubjectID

      UNION ALL

      SELECT
        NULL AS SubjectID,
        StudentID,
        AVG(marks)
      FROM marks
      GROUP BY StudentID;
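
      For comparison, a sketch of what the native 9.5 syntax for the same result could look like (same table and columns as in the alternative above):

      SELECT
        SubjectID,
        StudentID,
        AVG(marks)
      FROM marks
      GROUP BY GROUPING SETS ((SubjectID), (StudentID));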
    3. Review: Popular databases (Oracle / MSSQL) support this well. PostgreSQL has had this on its ToDo list for at least a decade ! Looking at the alternative, one can see that it is not just lengthy (and repetitive .. thus error-prone), but also non-performant (simply because it requires multiple runs over the same data-set).
    4. History:
      1. Already in PostgreSQL TODO list
      2. Discussions started (at least) way back in 2003.
      3. Patch:
        1. 2008 patch that didn't make it.
        2. The 2014 patch has been heavily discussed since, and finally just got through to PostgreSQL 9.5.
  7. ROLLUP
    1. Description: An obvious extension to GROUPING SETS (explained above), ROLLUP can be explained with a simple example:

       GROUP BY ROLLUP (Year, SubjectID, StudentID)

      is equivalent to

       GROUP BY GROUPING SETS (
         (Year, SubjectID, StudentID),
         (Year, SubjectID),
         (Year),
         ()
       );
    2. Support: Yes (in 9.5)
    3. Alternative:  This could be alternatively done with CTEs.

      WITH x AS (
        SELECT Year, SubjectID, StudentID, marks
        FROM marks
        WHERE passed
          AND NOT inactive
      )
      SELECT *
        FROM x

      UNION ALL

      SELECT
          Year, SubjectID, StudentID, AVG(marks)
        FROM x
        GROUP BY Year, SubjectID, StudentID

      UNION ALL

      SELECT
          Year, SubjectID, NULL AS StudentID, AVG(marks)
        FROM x
        GROUP BY Year, SubjectID

      UNION ALL

      SELECT
          Year, NULL AS SubjectID, NULL AS StudentID, AVG(marks)
        FROM x
        GROUP BY Year;
    4. Review: ROLLUPs are ideal for generating things like sub-totals, which at times are key performance factors when generating large Reports. The alternative essentially uses a CTE, which is subsequently used to calculate subtotals and totals. For multiple reasons this is sub-optimal and could be sped up with in-built support. Besides, the alternative is lengthy & repetitive (thus error-prone).
    5. History:
      1. Discussions started (at least) way back in 2003.
      2. Patches submitted
        1. The 2010 patch seemingly didn't make it.
        2. The 2014 attempt finally got through.
  8. CUBE
    1. Support: Yes (in 9.5)
    2. Description: Just like ROLLUP (was an extension of GROUPING SETS), CUBEs are an extension of ROLLUP (and thereby GROUPING SETS) and could be explained with the following example:

       GROUP BY CUBE (Year, SubjectID, StudentID)
      is equivalent to

       GROUP BY GROUPING SETS (
         (Year, SubjectID, StudentID),
         (Year, SubjectID),
         (Year, StudentID),
         (Year),
         (SubjectID, StudentID),
         (SubjectID),
         (StudentID),
         ()
       );
    3. Review: The alternative (not provided, for obvious reasons) is not just lengthy & repetitive (thus error-prone), but, more importantly, not as performant as is otherwise possible.
  9. MERGE INTO / UPSERT
    1. Support: Yes (in 9.5)
    2. Doable with alternative SQL? : Yes (for 9.4 and below)
      1. Is the alternative as Performant?
        1. No: This is because the alternative (URL given below) is a BEGIN/EXCEPTION based solution which is (time-wise) costly and an in-built support would certainly be faster.
    3. Description: For those new to the complexity of MERGE (or UPSERT) please read this first.

      TLDR: In the face of concurrent use, MERGE is difficult when trying to balance Performance vs Integrity.

      Unlike some other database engines (that are sometimes okay with trading off Integrity when it conflicts with Performance), PostgreSQL consistently prioritizes Data Integrity. The 'best' solution seems to have taken longer than expected, but when a complicated open-source development model needs to coherently agree upon a core feature addition, it really takes a few stars aligning to get such a piece of code in with most people in support of it.
    4. Example (SQL to create scenario + below SQL taken from here)
      MERGE INTO bonuses B
      USING (
       SELECT employee_id, salary
       FROM employee
       WHERE dept_no =20) E
      ON (B.employee_id = E.employee_id)
      WHEN MATCHED THEN
       UPDATE SET B.bonus = E.salary * 0.1
      WHEN NOT MATCHED THEN
       INSERT (B.employee_id, B.bonus)
       VALUES (E.employee_id, E.salary * 0.05);
    5. Alternative: The PostgreSQL documentation mentions one recommended way of doing UPSERT / MERGE here. But again, this is non-performant; 9.5 adds native support via INSERT .. ON CONFLICT (a.k.a. UPSERT).
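
      As a rough sketch, the 9.5 syntax for the MERGE example above could look something like the following (it assumes a unique constraint on bonuses.employee_id; the * 2 turns the 5% insert value into the 10% update for existing rows):

      INSERT INTO bonuses (employee_id, bonus)
      SELECT employee_id, salary * 0.05
      FROM employee
      WHERE dept_no = 20
      ON CONFLICT (employee_id)
        DO UPDATE SET bonus = EXCLUDED.bonus * 2;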
    6. History
      1. MySQL / Oracle / MSSQL support this very well.
      2. A long-pending requirement as per the Wiki, and it has now finally made it through!

16 May 2015

Postgres finally has CUBE / ROLLUP / GROUPING SETS !

Finally !

A *much* awaited feature, this attempt at adding the GROUPING SETS / ROLLUP / CUBE feature to PostgreSQL has been in the works for about a year (besides the many attempts in the past decade and a half that didn't get through). Thankfully it has finally got the approval of the powers that be, so the upcoming Postgres 9.5 will finally have this long-pending SQL feature.

MSSQL and Oracle have had this for a while, so it's about time that PostgreSQL sported it as well. A big boon for report-generating SQL, this feature makes what was earlier possible only with lots of unmanageable SQL hackery now possible with much cleaner code, and with much better (at times single-pass) performance.

Read here to know more about OLAP support in PostgreSQL.

Thanks a ton Andrew Gierth and Atri Sharma and so many others who directly or indirectly assisted in getting this patch out of the door!

Andrew / Atri... take that long pending break... one look at that mail thread and it seems you deserve it :D !

21 Dec 2014

Why's my database suddenly so big?


In a Product-based company, DB Developers at times don't get direct access to production boxes. Instead, an IT resource ends up managing a large swathe of Web / DB boxes. Since such a resource generally has a large field of operation, at times they need quick steps to identify the cause of sudden high disk usage.

In such a scenario (where Production is silo-ed out of DB Developers), correct triaging of a Database disk-usage spike is especially helpful, because a good bug-report is at times the only help that'd ensure a 1-iteration resolution.

Recently, one of our Production Database boxes hit a Nagios disk-alert and an IT personnel wanted to identify who (and specifically what) was causing this sudden spike.

From the looks of it, it clearly was a time-consuming + space-consuming Job running on the box and not much that could have been done (sans terminating it), but the following steps could have helped to prepare a bug-report of-sorts for the DB developer to identify / correct it before this happens again:
  1. Isolate whether the disk increase is because of postgres. i.e. Compare the following two disk-usage outputs:
    1. df -h
    2. du -h --max-depth=1 /var/lib/postgresql/
  2. On the psql prompt
    1. SELECT
        datname AS db_name,
        pg_size_pretty(pg_database_size(oid)) AS db_size
      FROM pg_database
      ORDER BY pg_database_size(oid) DESC
      LIMIT 10; 
    2. Connect to the Database. (Assuming that there is one large database on the system, and its name is X. In the above output, it'd be the first row, under db_name):
    3.  \c  X
    4. From Wiki: SELECT nspname || '.' || relname AS "relation",
          pg_size_pretty(pg_total_relation_size(C.oid)) AS "total_size"
        FROM pg_class C
        LEFT JOIN pg_namespace N ON (N.oid = C.relnamespace)
        WHERE nspname NOT IN ('pg_catalog', 'information_schema')
          AND C.relkind <> 'i'
          AND nspname !~ '^pg_toast'
        ORDER BY pg_total_relation_size(C.oid) DESC
        LIMIT 10; SELECT NOW();
  3. A minor addition, but sometimes, a simple thing such as appending the current Server-Timestamp is 'very' beneficial for accurate triaging. For e.g. If a given table is known to grow very big, N minutes into processing, (assuming we have at least basic logging in place) a Developer can easily identify which method to focus on.
  4. In most cases, the query above (2.4) would give the list of tables that one is looking for:
    1. In most cases, it turns out that the Job is expected to consume processing disk-space, and if so, the DB developer probably needs to request IT for mandatory empty-disk space for each night's periodic processing. That's a simple solution and we're done.
    2. Alternatively, it's possible that the Job is (unexpectedly) creating some very large tables (and probably removing them post-processing). Such cases may need optimization, which probably got aggravated purely by DB growth. Possible, but again, we're clear about how to resolve it.
  5. However, at times the tables returned don't cover 'all' that could consume disk-space:
    1. The list of tables generated in Step 2.4 above covers:
      1. (Regular) Tables
      2. Temporary Tables
      3. Unlogged Tables
    2. However, it does not cover tables created within another session's transaction. More detail on that is given below:

Since Temporary Tables (within a Transaction) are NOT visible from another session, Query 2.4, when run by an administrator in another psql session, would not be able to show the temporary table.
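
A quick way to see this for yourself (table name and row count here are made up):

-- Session 1: create a large temporary table inside an open transaction
BEGIN;
CREATE TEMP TABLE big_tmp AS
  SELECT generate_series(1, 10000000) AS i;
-- ... and leave the transaction open

-- Session 2: Query 2.4 does not list big_tmp (its pg_class entry isn't visible
-- to other sessions yet), even though a tNN_XXXX file under data/base keeps growing.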

To find disk-consuming tables, an alternate approach may yield better results:

$ cd /var/lib/postgres/9.2/data/base/
$ ls -lR | awk '$5 > 1000000' | awk '{print "File: " $9 ",  Size: " $5}' | sed 's/\./ part-/g'

File: t22_3586845404,  Size: 819200000
File: 3586846552,  Size: 630161408
File: t24_3586846545,  Size: 1073741824
File: t24_3586846545 part-1,  Size: 1073741824
File: t24_3586846545 part-2,  Size: 1073741824
File: t24_3586846545 part-3,  Size: 559702016

Lets analyse this Shell command:
  • When run from the data/base folder of PostgreSQL it shows which file uses most disk-space across *all* databases. If you already know which database to focus on, you may want to go inside data/base/xxxx folder and run this Shell command there instead.
  • In the output we see three things:
    • File 3586846552 is a large file pointing to a (non-temporary) table
    • File t22_3586845404 is a large file; it points to a *temporary* table but is less than 1 GB in size
    • File t24_3586846545 is a large file; it also points to a *temporary* table and is between 3 GB and 4 GB in size (each file part is capped at 1 GB), and therefore is a good contender to be researched further.

So lets investigate file t24_3586846545 further.

From a psql prompt:
postgres=# \x
Expanded display is on.
postgres=#

WITH x AS (
  SELECT trim('t24_3586846545')::TEXT AS folder_name
),
y AS (
  SELECT
    CASE
      WHEN position('_' in folder_name) = 0
        THEN folder_name::BIGINT
      ELSE substring(folder_name
        FROM (position('_' in folder_name) + 1))::BIGINT
    END AS oid
  FROM x
),
z AS (
  SELECT
    row_to_json(psa.*)::TEXT AS pg_stat_activity_Dump,
    query AS latest_successful_query,
    array_agg(mode) AS modes,
    psa.pid
  FROM y, pg_locks join pg_stat_activity psa
    USING (pid)
  WHERE relation = oid
    AND granted
  GROUP BY psa.pid,row_to_json(psa.*)::TEXT, query
)
  SELECT *
  FROM z

UNION ALL

  SELECT
    'Doesnt look like this folder (' ||
    (SELECT folder_name FROM x) ||
    ') stores data for another session''s transaction'::TEXT,
    NULL, NULL, NULL
  FROM z
  HAVING COUNT(*) = 0;

-[ RECORD 1 ]-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
pg_stat_activity_dump   | {"datid":"2946058308","datname":"rt_ra","pid":16828,"usesysid":"16384","usename":"rms","application_name":"psql","client_addr":null,"client_hostname":null,"client_port":-1,"backend_start":"2014-12-20 18:52:47.702365+05:30","xact_start":"2014-12-20 20:01:12.630708+05:30","query_start":"2014-12-20 20:53:41.733738+05:30","state_change":"2014-12-20 20:53:41.734325+05:30","waiting":false,"state":"idle in transaction","query":"select n.nspname from pg_class c join pg_namespace n on n.oid=c.relnamespace\nwhere c.relname ='a' and n.nspname like 'pg_temp%';"}
latest_successful_query | select n.nspname from pg_class c join pg_namespace n on n.oid=c.relnamespace
                        | where c.relname ='a' and n.nspname like 'pg_temp%';
modes                   | {AccessExclusiveLock,RowExclusiveLock}
pid                     | 16828

postgres=#


The output of this SQL is probably going to be at least of some help, if the rogue table is a table within another Transaction. I'll try to dissect what this query is doing:
  • The first column is basically the entire pg_stat_activity row mangled into a JSON object.
  • The pid is the PID of the Postgres process that is serving the Connection that currently has a Lock on the table that identifies t24_3586846545. This way, we can know more about the connection that created this (large) table. (Please be very clear, that unless you know what you're doing, you shouldn't *kill* any postgres process from bash).
  • The columns 'latest_successful_query' and 'modes' are probably uninteresting to the Admin, but may be of big help to the Developer.

(Obviously, a hands-on IT admin needs to replace the string (t24_3586846545) in the above SQL, with the file-name that (s)he gets the most number of times, when executing the previous shell-command).

Happy triaging :) !

What's in an empty table?

How much storage does an empty table in Postgres take? This is a post about Postgres tables that store ... well basically ...  Nothing . The...