25 May 2015

SAVE Tax with new NPS provisions!!!

This article is about the NPS (National Pension System, or the New Pension Scheme as some may know it), a loose replacement for the erstwhile pension system prevalent in India. Although not a 100% replacement, it is the next best thing, so understanding it better is a good idea.

For all those trying hard to save Taxes while working their way around the Indian Income Tax system, the following may help in clarifying doubts / providing more avenues.

As per Income Tax provisions effective in 2015-16, investments in NPS can be done in three ways:
  • Employee contribution, under 80C
  • Direct investment in NPS (outside of your employer), under 80CCD(1B)
  • Employer contribution, under 80CCD(2)

 Let's look at these in more detail:

Employee contribution, under 80C

Any Indian (who has a PRAN) can invest in NPS by contributing a minimum of Rs. 500/- per month or a minimum of Rs. 6000/- annually to their NPS account. This investment can be deducted from the person's taxable income (subject to the overall cap of Rs. 1,50,000/- under 80C). Strictly speaking, the deduction for one's own NPS contribution is claimed under Section 80CCD(1), which shares the Rs. 1,50,000/- ceiling with 80C.

Although a good idea, this is generally futile for tax purposes, since most people who have been working for a while have already reached their Rs. 1,50,000/- 80C limit by other means (e.g. Life Insurance / ULIP / PF). Investing in NPS is then still good for the long term, but it does not save any tax for the current financial year.


Direct investment in NPS (outside of your employer), under 80CCD(1B)

Any Indian (who has a PRAN) can invest directly in NPS, without the support of his / her employer. This facility has been available for a while, and is the oldest form of investing in NPS. Till recently, the caveat with this form of investment was that it did not carry any tax exemption.

Although modified earlier, as of 2015-16 the Income Tax provisions are such that investments in NPS (made directly) up to Rs. 50,000/- can be deducted from the person's taxable income. To clarify, this doesn't mean that one can't invest more than Rs. 50k, but that only the first Rs. 50k of that amount can be deducted from his / her taxable income.

For e.g. let's assume that Ms. Lata has used up her Rs. 1,50,000/- of 80C investment options (via Life Insurance / PF investments) and her net taxable income is Rs. 3,75,000/-. Now let's assume that she invests Rs. 1,50,000/- directly in NPS (without her employer's assistance); her net taxable income would then become Rs. 3,25,000/- (i.e. 3,75,000 - 50,000). So although the entire sum of Rs. 1,50,000 was invested in NPS, only the first Rs. 50,000 was deducted from taxable income.
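
The capping in Ms. Lata's example can be sketched in a few lines of Python. This is only an illustration: the function and constant names are made up for this post, and the Rs. 50,000 ceiling is the 2015-16 figure quoted above.

```python
# Illustrative sketch only; names are hypothetical, figures are from the example above.
SEC_80CCD_1B_LIMIT = 50_000  # ceiling for direct NPS investments (FY 2015-16)


def taxable_after_direct_nps(net_taxable_income, direct_nps_investment):
    """Reduce taxable income by the direct NPS investment, capped at the 80CCD(1B) limit."""
    deduction = min(direct_nps_investment, SEC_80CCD_1B_LIMIT)
    return net_taxable_income - deduction


# Ms. Lata: Rs. 3,75,000 taxable income, Rs. 1,50,000 invested directly in NPS.
print(taxable_after_direct_nps(375_000, 150_000))  # 325000
```

Investing less than the ceiling (say Rs. 20,000) would simply deduct the full amount invested.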


Employer contribution, under 80CCD(2)

The third and the most unclear & interesting section is the 80CCD(2) that allows an employee to save much more tax than was possible earlier.

Under this section (in addition to the above two), an employee can request his / her employer to deduct a given sum from the monthly salary and invest it in NPS. This contribution (up to a maximum of 10% of Basic Pay) can be additionally deducted from the employee's taxable income, which in some cases can considerably reduce the net tax outflow for the financial year.

Example for all above sections

Let's take an example that covers all the sections given above:

Let's assume that Ms. Lata's Basic pay is Rs. 11,00,000 (11 lakh) and she has invested Rs. 1,00,000 in Life Insurance and Rs. 40,000 in ELSS Funds, as well as Rs. 10,000 in NPS (under section 80C). Further, she directly invested (without her employer's assistance) Rs. 50,000 in her NPS account (under section 80CCD(1B)). Lastly, she requested her employer to invest Rs. 10,000/- per month in her NPS account under Section 80CCD(2). That adds up to Rs. 1,20,000 for the year, of which only 10% of Basic (Rs. 1,10,000) is deductible.

Then her net taxable income would be as follows:

Taxable income = 11,00,000 
                   - Rs. 1,50,000 under 80C      - max (1.5 lakh)
                   - Rs.   50,000 under 80CCD(1B)- max (50k)
                   - Rs. 1,10,000 under 80CCD(2) - max (10% of Basic)
               = 11 lakh - 1.5 lakh - 0.5 lakh - 1.1 lakh
               = 7.9 lakh

This should clarify all doubts pertaining to investment in NPS for the financial year 2015-16.
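
For those who prefer to see the arithmetic spelled out, here is a rough Python sketch of the same calculation. The function and variable names are invented for illustration, the limits are the 2015-16 figures used in this article, and (as in the example) the Basic pay is treated as the whole income.

```python
# Illustrative sketch only; names are hypothetical, limits are those quoted in this article.
SEC_80C_LIMIT = 150_000      # overall 80C ceiling
SEC_80CCD_1B_LIMIT = 50_000  # direct NPS investment ceiling


def nps_taxable_income(income, basic_pay, sec_80c, direct_nps, employer_routed_nps):
    """Apply the three caps discussed above and return the remaining taxable income."""
    ded_80c = min(sec_80c, SEC_80C_LIMIT)
    ded_1b = min(direct_nps, SEC_80CCD_1B_LIMIT)
    ded_ccd2 = min(employer_routed_nps, basic_pay // 10)  # 10% of Basic
    return income - ded_80c - ded_1b - ded_ccd2


# Ms. Lata: Life Insurance + ELSS + NPS under 80C, Rs. 50,000 direct,
# and Rs. 10,000 per month routed through her employer.
lata = nps_taxable_income(
    income=1_100_000,
    basic_pay=1_100_000,
    sec_80c=100_000 + 40_000 + 10_000,
    direct_nps=50_000,
    employer_routed_nps=10_000 * 12,
)
print(lata)  # 790000
```

Note that the employer-routed Rs. 1,20,000 is cut down to Rs. 1,10,000 by the 10% of Basic cap, which is why the result matches the 7.9 lakh above.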

17 May 2015

Basic OLAP Support in PostgreSQL

While reviewing an existing application, I thought it'd be worthwhile to review how good / bad PostgreSQL is in terms of OLAP. This (growing) post is going to be my (un)learning of how ready PostgreSQL is.

  1. Row Numbering
    1. Support: Yes. 
    2. Use: Row_Number() function numbers rows generated in a result-set.
    3. Example:

      SELECT
        row_number() OVER (ORDER BY marks DESC) AS rn,
        name
      FROM x;
    4. Review: Some databases have different variants that accomplish this (for e.g. Oracle has a pseudo column called ROWNUM), but PostgreSQL fully supports the SQL Compliant syntax.
  2. Rank()
    1. Support: Yes. 
    2. Use: Rank() and Dense_Rank() functions number the rank of the compared item. 
    3. Example:

      SELECT
        rank() OVER (ORDER BY marks DESC) AS rn,
        dense_rank() OVER (ORDER BY marks DESC) AS drn,
        name
      FROM x;
    4. Review: It's useful and fully supported.
  3. Window Clause
    1. Support:
      1. OVER (PARTITION BY): Yes
      2. OVER (ORDER BY): Yes
      3. OVER (RANGE): Yes
    2. Use:  Read more here.
    3. Example:  
    4. Review: These are extremely helpful for people serious about data-extraction / reporting and fully supported.
  4. NTile
    1. Support: Yes
    2. Use: Ntile(n) splits the ordered rows into n roughly equal buckets and returns each row's bucket number.
    3. Example:

      SELECT
        ntile(4) OVER (ORDER BY marks DESC) AS quartile,
        ntile(10) OVER (ORDER BY marks DESC) AS decile,
        ntile(100) OVER (ORDER BY marks DESC) AS percentile,
        name
      FROM x;
    4. Review: Versatile and fully supported.
  5. Nested OLAP Aggregations
    1. Support: No
      1. But doable with alternative SQL? : Yes
        1. Is that as Performant? : Mostly No
    2. Description: Allow something like

      SELECT
        subject,
        AVG(SUM(marks) GROUP BY class)
      FROM marks
      GROUP BY subject;
    3. Alternative:  This could be done with Sub-Selects like this:

      SELECT
        subject,
        AVG(sum_marks) AS avg
      FROM (
        SELECT
         subject,
         class,
         SUM(marks) AS sum_marks   
        FROM marks   
        GROUP BY subject, class
        ) mrk
      GROUP BY subject;
    4. Review: In the two examples we are trying to calculate the Per-Subject-Average of (Total marks obtained in different classes). Although PostgreSQL doesn't support this form of nested aggregates, it clearly is a neat way of doing things. The alternative admittedly looks like a kludge, so in-built support would be a nice-to-have feature.
  6. GROUPING SETS
    1. Support: Yes (in 9.5)
    2. Alternative:  This could be alternatively done with UNION ALL like this:
      SELECT
       SubjectID,
       NULL AS StudentID,
       AVG(marks)
      FROM marks
      GROUP BY SubjectID

      UNION ALL

      SELECT
       NULL AS SubjectID,
       StudentID,
       AVG(marks)
      FROM marks
      GROUP BY StudentID;
    3. Review: Popular databases (Oracle / MSSQL) support this well. PostgreSQL has had this on its ToDo list for at least a decade! Looking at the alternative, one can see that it is not just lengthy (and repetitive, thus error-prone), but also non-performant (simply because it requires multiple passes over the same data-set).
    4. History:
      1. Already in PostgreSQL TODO list
      2. Discussions started (at least) way back in 2003.
      3. Patch:
        1. 2008 patch that didn't make it.
        2. The 2014 patch was heavily discussed and has finally just made it into PostgreSQL 9.5.
  7. ROLLUP
    1. Description: An obvious extension to GROUPING SETS (explained above), ROLLUP can be explained with a simple example:

       GROUP BY ROLLUP (Year, SubjectID, StudentID)

      is equivalent to

       GROUP BY GROUPING SETS (
        (Year, SubjectID, StudentID),
        (Year, SubjectID),
        (Year),
        ()
       );
    2. Support: Yes (in 9.5)
    3. Alternative:  This could be alternatively done with CTEs.

      WITH x AS (
        SELECT Year, SubjectID, StudentID, marks
        FROM marks
        WHERE passed
          AND NOT inactive
      )
      SELECT *
        FROM x

      UNION ALL

      SELECT
          Year, SubjectID, StudentID, AVG(marks)
        FROM x
        GROUP BY Year, SubjectID, StudentID

      UNION ALL

      SELECT
          Year, SubjectID, NULL AS StudentID, AVG(marks)
        FROM x
        GROUP BY Year, SubjectID

      UNION ALL

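      SELECT
          Year, NULL AS SubjectID, NULL AS StudentID, AVG(marks)
        FROM x
        GROUP BY Year

      UNION ALL

      -- Grand total, matching the empty grouping set ()
      SELECT
          NULL AS Year, NULL AS SubjectID, NULL AS StudentID, AVG(marks)
        FROM x;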
    4. Review: ROLLUPs are ideal for generating things like sub-totals, which at times become key performance factors when generating large reports. The alternative uses a CTE, which is then re-read once per grouping level to calculate subtotals and totals. This is sub-optimal: in-built support can compute all the levels faster, at times in a single pass. Besides, the alternative is lengthy & repetitive (thus error-prone).
    5. History:
      1. Discussions started (at least) way back in 2003.
      2. Patches submitted
        1. The 2010 patch seemingly didn't make it.
        2. The 2014 attempt finally got through.
  8. CUBE
    1. Support: Yes (in 9.5)
    2. Description: Just like ROLLUP (was an extension of GROUPING SETS), CUBEs are an extension of ROLLUP (and thereby GROUPING SETS) and could be explained with the following example:

       GROUP BY CUBE (Year, SubjectID, StudentID)
      is equivalent to

       GROUP BY GROUPING SETS (
        (Year, SubjectID, StudentID),
        (Year, SubjectID),
        (Year, StudentID),
        (Year),
        (SubjectID, StudentID),
        (SubjectID),
        (StudentID),
        ()
       );
    3. Review: The alternative (not provided for obvious reasons) is not just lengthy & repetitive (thus error-prone) but primarily not as performant as is otherwise possible.
  9. MERGE INTO / UPSERT
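    1. Support: Yes (in 9.5), in the form of INSERT .. ON CONFLICT (a.k.a. UPSERT); the MERGE INTO statement itself is not part of 9.5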
    2. Doable with alternative SQL? : Yes (for 9.4 and below)
      1. Is the alternative as Performant?
        1. No: This is because the alternative (URL given below) is a BEGIN/EXCEPTION based solution which is (time-wise) costly and an in-built support would certainly be faster.
    3. Description: For those new to the complexity of MERGE (or UPSERT) please read this first.

      TLDR: In the face of Concurrent Use, MERGE is difficult when trying to balance Performance vs Integrity.

      Unlike some other database engines (that are sometimes okay with trading off Integrity when it conflicts with Performance), PostgreSQL consistently prioritizes Data Integrity. The 'best' solution took longer than expected, but in a complicated open-source development model where core feature additions need broad agreement, it really takes a few falling stars to get such a piece of code in with most people in support of it.
    4. Example (SQL to create scenario + below SQL taken from here)
      MERGE INTO bonuses B
      USING (
       SELECT employee_id, salary
       FROM employee
       WHERE dept_no = 20) E
      ON (B.employee_id = E.employee_id)
      WHEN MATCHED THEN
       UPDATE SET B.bonus = E.salary * 0.1
      WHEN NOT MATCHED THEN
       INSERT (B.employee_id, B.bonus)
       VALUES (E.employee_id, E.salary * 0.05);
    5. Alternative: The PostgreSQL documentation mentions one recommended way of doing UPSERT / MERGE here. But again, this is non-performant, and from 9.5 onwards the in-built INSERT .. ON CONFLICT (a.k.a. UPSERT) support is the way to go.
    6. History
      1. MySQL / Oracle / MSSQL support this very well.
      2. A long-pending requirement as per the Wiki, and it has now finally made it through!
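
To make the ROLLUP and CUBE equivalences from items 7 and 8 a little more concrete, here is a small Python sketch that lists the grouping sets each one expands to. It only illustrates the expansion rule (ROLLUP gives every prefix of the column list, CUBE gives every subset); it says nothing about how PostgreSQL implements either, and the function names are made up.

```python
from itertools import combinations


def rollup_sets(columns):
    """ROLLUP: every prefix of the column list, longest first, ending with ()."""
    return [tuple(columns[:n]) for n in range(len(columns), -1, -1)]


def cube_sets(columns):
    """CUBE: every subset of the column list, largest first, ending with ()."""
    sets = []
    for size in range(len(columns), -1, -1):
        sets.extend(combinations(columns, size))
    return sets


cols = ["Year", "SubjectID", "StudentID"]

for grouping in rollup_sets(cols):
    print("ROLLUP ->", grouping)

for grouping in cube_sets(cols):
    print("CUBE   ->", grouping)
```

For three columns this lists 4 grouping sets for ROLLUP and 8 for CUBE, which is also the number of UNION ALL branches a hand-written alternative would need.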

16 May 2015

Postgres finally has CUBE / ROLLUP / GROUPING SETS !

Finally !

A *much* awaited feature: this attempt at adding GROUPING SETS / ROLLUP / CUBE to PostgreSQL has been in the works for about a year (besides the many attempts over the past decade and a half that didn't get through). Thankfully it has finally got the approval of the powers that be, so the upcoming Postgres 9.5 will finally have this long-pending SQL feature.

MSSQL and Oracle have had this for a while, so it's about time that PostgreSQL sported it as well. A big boon for report-generating SQL, this feature makes what was earlier possible only with lots of unmanageable SQL hacks now possible with much cleaner code, and with much better (at times single-pass) performance.

Read here to know more about OLAP support in PostgreSQL.

Thanks a ton Andrew Gierth and Atri Sharma and so many others who directly or indirectly assisted in getting this patch out of the door!

Andrew / Atri... take that long pending break... one look at that mail thread and it seems you deserve it :D !
