Indexing Very Large Files in SharePoint 2010
October 28, 2010
The 16 MB limit on indexing file content still exists in SharePoint 2010. The limit was introduced to avoid network congestion and to improve SharePoint performance.
Whenever the file format is compatible with SharePoint but the file is larger than the maximum threshold, the indexer still indexes the file's metadata. However, the content of the file is not indexed.
For files that are smaller than 16 MB and in a format compatible with SharePoint, both the content and the metadata are indexed.
The major problem is that SharePoint 2007 logged a warning in the crawl logs whenever a file was larger than 16 MB. SharePoint 2010 generates no such warning, which makes the issue very difficult to diagnose.
This behavior is governed by two registry values:
MaxDownloadSize: This specifies the maximum size of the document text that is filtered.
MaxGrowthFactor: This specifies how large the output of the index filter can be.
By default, MaxDownloadSize is 16 MB and MaxGrowthFactor is 4 (a multiplier, not a size). This implies that:
The maximum size when you index a file share will be 16 * 4 = 64 MB.
The maximum size when you index a document on a web site will be 16 MB.
Before SharePoint 2010, these settings were changed by editing the registry at
HKEY_LOCAL_MACHINE -> SOFTWARE -> Microsoft -> Office Server -> 12.0 -> Search -> Global -> Gathering Manager
However, for some reason these settings don't persist when applied in a SharePoint 2010 environment. You need to change them using PowerShell.
The max value supported by “MaxDownloadSize” is 2 GB.
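As a rough illustration of the PowerShell change, here is a minimal sketch. It assumes the farm has a single search service application, that the value is given in MB, and that the search Windows service is named OSearch14. The value 64 is only an example, not a recommendation. Treat this as a starting point and verify it in a test environment first.

```powershell
# Run from the SharePoint 2010 Management Shell on a farm server.
# Assumption: exactly one search service application exists in the farm.
$ssa = Get-SPEnterpriseSearchServiceApplication

# Display the current limit (assumed to be expressed in MB).
$ssa.GetProperty("MaxDownloadSize")

# Raise the limit; 64 is an example value only.
$ssa.SetProperty("MaxDownloadSize", 64)
$ssa.Update()

# Restart the search service so the new value is picked up.
# Assumption: the service name is OSearch14 on this server.
Restart-Service OSearch14
```

Files that were skipped earlier will most likely need a full crawl before their content shows up in the index.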