A duplicate attribute key has been found during processing (revisited)
SQL Server 2005-2012
This blog post is my 100th post here and I’d like to say a big thanks to all the readers and supporters of this blog. Reason enough to concentrate on a very popular topic from my past blog posts.
Back in 2009 I wrote about the typical cases for the duplicate attribute key error in SSAS. From the response I got to this post this is still a very common error message causing confusion. Reason enough to revisit this topic and complement my last post with three new variations. Thanks for all the comments, as two of the cases are taken directly from the comments on my previous post.
Reason 4 (likely): This is a variation of reason 1 from my previous post . Reason 1 mentioned a NULL value in the key column being converted to a blank string which might conflict with another row that really has a blank string value. The same situation happens (as one of the comments mentioned) with a NULL value being converted to 0 for a numeric column.
Solution (identical as in reason 1): Try avoiding those NULL values in your data source (for example by using a DSV query and the T-SQL coalesce-function). When your source data is a data warehouse it’s also a good practice to avoid null values as they complicate the queries to the data warehouse.
Reason 5 (not very likely): The collation of the SQL Server relational database is “more detailed” than the collation on the SSAS side for an attribute’s key column. For example, SQL Server uses a case sensitive collation (like SQL_Latin1_General_CP1250_CS_AS) while SSAS uses a case in-sensitive collation. This causes SQL Server to treat a and A as two different values, while SSAS treats them as the same value (giving the duplicate key error). The same thing happens with an accent sensitive collation.
Solution: If you can, use the same collation type on both sides. If not, you could access the database table using a view that changes the collation or change the collation for the SSAS key column(s).
Reason 6 (not very likely): Your using the translation feature of the Enterprise Edition and for a certain attribute key you’re getting more than one translation.
Solution: Make sure that the translations are consistent (from the view-point of the attribute key)
Long explanation for reason 4
In order to produce this error, I used a table with a numeric (int) column. The following rows are present in the table:
As you can see, there is one row with a value of 0 and one row with a value of NULL. I created an attribute that simply uses this column as attribute key (no definition for attribute name or value).
When you try to process the dimension, you get the duplicate key error:
The error message reads:
Errors in the OLAP storage engine: A duplicate attribute key has been found when processing: Table: 'dbo_DimTestNull', Column: 'Key', Value: '0'. The attribute is 'Key'.
If you check the details for the key column you see that NullProcessing is set to Automatic (which is identical to ZeroOrBlank in this case, see long explanation here ).
When reading the dimension, SSAS runs a SQL query similar to this one:
- SELECT DISTINCT [dbo_DimTestNull].[Key] AS [dbo_DimTestNullKey0_0] FROM [dbo].[DimTestNull] AS [dbo_DimTestNull]
Now, with NULL being converted to the number 0 this results in two rows with the same key for a column that is expected to contain distinct values. This causes the error.
As with reason 1, having NULL values in the data warehouse is not a best practice as they cause problems not only with SSAS processing. So, it’s best to replace the NULL values in ETL and to define all colums as NOT NULL.
A tricky thing with this error is, that it might not occur on a process update. So, let’s assume your dimension contains only a row with the value 0 and processes fine. Then, sometimes later, a row with a NULL value appears. Since you’re still doing a process update, the error will not occur! After a while you try to full process the dimension and now you see the error message from above. This can be very confusion especially if the reason for the full process is a major change to the dimension (so you suspect this change to be the reason for the error).
Long explanation for reason 5
Let’s assume we have a very simple source table for our dimension:
I’d like to build a dimension based on this table. I could use the column Key as the dimension Key. Next I use the column AttributeKey as the source for my new attribute.
SSAS uses this query to get the values for the attribute’s key:
- SELECT DISTINCT [dbo_DimTestCase].[AttributeKey] AS [dbo_DimTestCaseAttributeKey0_0] FROM [dbo].[DimTestCase] AS [dbo_DimTestCase]
If you have a case insensitive collation (like SQL_Latin1_General_Cp1_CI_AS, CI=case insensitive) on this column in the relational database, the result contains only one row (in my case with the lower a). However, if you have a case-sensitive collation (like SQL_Latin1_General_Cp1_CS_AS, CS=case sensitive), you’ll get two rows like this:
You can see the collation for the column in the column properties in SQL Server Management Studio:
Since the CS in the collation means case sensitive, SQL Server treats the entries a and A as two different values.
Now, let’s have a look at the SSAS collation. You can find this setting in the attribute’s key column:
A blank entry (“none”) means the default collation. If you click on the ellipsis, you can see the details of the collation and make modifications for this single column:
In my case, the default collation of the SSAS database is set to Latin1_General, case-insensitive.
So, if the collation on the SQL Server side is case sensitive while the collation on the SSAS side is not, we’re in a similar situation as for reason 4: We’re getting two rows a, A while SSAS only sees one value (a), thus resulting in a duplicate key error.
Basically, to fix this error you have two options:
- make the collation on the SQL Server side case insensitive
- make the collation on the SSAS side case sensitive
In detail, you may go for one of the following changes:
- change the collation for the column in the relational database to a case insensitive one, for example like this:
- ALTER TABLE dbo.DimTestCase
- ALTER COLUMN AttributeKey NVARCHAR(150)
- COLLATE SQL_Latin1_General_CP1_CI_AS
- Create a view on the SQL Server side to change the collation to case insensitive using a statement similar to the following and replace the table in the data source view with this view:
- create view vDimTestCase as
- select [Key], AttributeKey COLLATE SQL_Latin1_General_CP1_CI_AS AttributeKey, AttributeName
- from dbo.DimTestCase
- change the collation for this key column (check case sensitive in the dialog above)
- leave the collation of this column to none (choose “none” in the combo box from the dialog above, if you already changed the collation) and set the collation on the dimension properties itself (this then defines the collation of all column binding in the dimension that don’t have an explicit collation-override)
- leave the collation of this column and for the dimension set to “none” and set the collation on the SSAS server to a case sensitive collation (server restart required). Be careful as has an effect on all the SSAS databases sitting on this server, unless the databases are not defining their own collation (see above).
As mentioned above, the same thing may happen with a accent sensitive collation on the SQL Server side and a accent insensitive collation on the SSAS side, for example with source data like this:
Long explanation for reason 6
This is a variation of case 3 of my previous post. If you’re using translations, they are handled like the Name property of your attribute and therefore they also have to be consistent. Take a look at the following table:
I’ve created an attribute that uses EnglishName as the key.
This works fine. By default, the key column is also used as the name column.
Now, we’re setting up a translation that uses the column GermanName as the translation for this attribute. Now, EnglishName is still the key, but GermanName gives two different results for the key value “Summer”: “Sommer” and “Fällt aus”. This causes the error. If you replace the German translation in line 5 with “Sommer” the dimension processes correctly again. In general, this error is very unlikely, because usually there will be some kind of translation table providing exactly one translation per original row. It still can happen, since I recently saw this in a project of mine.